Page 4 of 5 FirstFirst ... 2345 LastLast
Results 31 to 40 of 45

Thread: Handling huge amounts of data

  1. #31
    Legendary Member cairnswm's Avatar
    Join Date
    Nov 2002
    Location
    Randburg, South Africa
    Posts
    1,537

    Handling huge amounts of data

    You might want to look at one of Jan's Delphi components (http://jansfreeware.com/jfdelphi.htm):

    TjanSQL 1.1
    2-April-2002 size:379kb
    TjanSQL is a single user relational Database engine implemented as a Delphi object using plain text files with semi-colon separated data for data storage. Supported SQL: SELECT (with table joins, field aliases and calculated), UPDATE, INSERT (values and sub-select), DELETE, CREATE TABLE, DROP TABLE, ALTER TABLE, CONNECT TO, COMMIT, WHERE (rich bracketed expression), IN (list or sub query), GROUP BY, HAVING, ORDER BY ( ASC, DESC), nested sub queries, statistics (COUNT, SUM, AVG, MAX, MIN), operators (+,-,*,/, and, or,>,>=,<,<=,=,<>,Like), functions (UPPER, LOWER, TRIM, LEFT, MID, RIGHT, LEN, FIX, SOUNDEX, SQR, SQRT). High performance: complete in-memory handling of tables and recordsets; semi-compiled expressions. Released under MOZILLA PUBLIC LICENSE Version 1.1. NEW FEATURES: fixed memory leak, calculated fields (in select and update statements), field aliases, table aliases, join "unlimited" tables, stdDev aggregate function, ASSIGN TO for named temporary tables, SAVE TABLE for persisting recordsets, INSERT INTO, ISO 8601 dates, numerous extra functions.
    I have never used it myself, but I've always sort of kept it in mind for the day I want an SQL-based text file system.
    William Cairns
    My Games: http://www.cairnsgames.co.za (Currently very inactive)
    MyOnline Games: http://TheGameDeveloper.co.za (Currently very inactive)

  2. #32

    Handling huge amounts of data

    Thanks for the support, guys.

    I forgot to say that there are around 6000+ cards (and growing).

    So I can't store them all in a collection. I will use collections only for built decks, so I have to load the text from -somewhere- (i.e. a text file, .xls or other) and the images from a folder.

    You can understand that looking up a specific card's text in a file of 6000 cards, for a deck of 59+ cards (maybe all unique), is really laggy! That's why I asked about SQL queries.

    I will take a look at the link above, thanks again.
    Will: "Before you learn how to cook a fish you must first learn how to catch a fish." coolest

  3. #33
    Legendary Member cairnswm's Avatar
    Join Date
    Nov 2002
    Location
    Randburg, South Africa
    Posts
    1,537

    Handling huge amounts of data

    SQL by its nature will always be SLOWER than a custom-built binary search. If you take the time to think about it, each SQL query must be parsed for correctness, then 'interpreted' [size=9px](1)[/size] before the data is accessed. The data itself may be fragmented across multiple disk segments, etc., so it will not be the fastest option for single-record searches.

    A binary file using standard Delphi record structures will always be faster, unless it needs too many disk accesses to find the data.

    Code:
    Type
      TCardRecord = Record
         CardName : String[30];
         CardText : String[255];
         ....
       End;

    Var
      CardFile : File of TCardRecord;
    Then ensure the file is stored in sorted order, and do a binary search across the file. (If you want an example, just ask.)

    Or alternatively, store a hash along with the card's record number in an index file. Hash the card name, look it up in the index file to get the record number, then access the record directly in the card file.

    Lastly - if you want to show off - create the card file and then index the card names, along with their record numbers, in a B+ tree structure. This will be fast, possibly faster than the hash table idea, but while I've always wanted to implement a B+ tree I never have.





    [size=9px](1) I say interpreted but it could be compiled or similar as well.[/size]
    William Cairns
    My Games: http://www.cairnsgames.co.za (Currently very inactive)
    MyOnline Games: http://TheGameDeveloper.co.za (Currently very inactive)

  4. #34

    Handling huge amounts of data

    OK, but doing a binary search on a file means I'd have to load it all into RAM first; and at the moment I can't use a record as you posted, since the card text doesn't have a fixed length in the .xls that I have.
    (If you know a way to save fields with a fixed length from Excel, tell me.)
    Otherwise I'd have to read and parse every row :/ or write a program to parse it and convert it to the format I want.

    edited: I have corrected the post, my english sucks!
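    For the "program to parse and convert" idea: a one-off converter could read a semicolon-separated text export of the .xls and write fixed-length records - a sketch only, where the two-field layout and the ';' separator are assumptions (assigning to a String[N] ShortString truncates silently, which is what gives the fixed on-disk size):

    [pascal]Type
      TCardRecord = Record
        CardName : String[30];   // ShortStrings have a fixed on-disk size
        CardText : String[255];
      End;

    // One-off converter: read a semicolon-separated export and write
    // fixed-length records suitable for File of TCardRecord access.
    Procedure ConvertExport(const InName, OutName : String);
    Var
      T    : TextFile;
      F    : File of TCardRecord;
      Line : String;
      CR   : TCardRecord;
      P    : Integer;
    Begin
      AssignFile(T, InName);
      Reset(T);
      AssignFile(F, OutName);
      Rewrite(F);
      While not Eof(T) do
      Begin
        ReadLn(T, Line);
        P := Pos(';', Line);
        CR.CardName := Copy(Line, 1, P-1);            // truncated to 30 chars
        CR.CardText := Copy(Line, P+1, Length(Line)); // truncated to 255 chars
        Write(F, CR);
      End;
      CloseFile(T);
      CloseFile(F);
    End;
    [/pascal]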
    Will: "Before you learn how to cook a fish you must first learn how to catch a fish." coolest

  5. #35
    Legendary Member cairnswm's Avatar
    Join Date
    Nov 2002
    Location
    Randburg, South Africa
    Posts
    1,537

    Handling huge amounts of data

    You can do a binary search using the FileSeek and FilePos functions and run it against the disk instead of in memory.

    If I remember, I'll put together a little example for you tomorrow - I need to go home and take the kids out now.
    William Cairns
    My Games: http://www.cairnsgames.co.za (Currently very inactive)
    MyOnline Games: http://TheGameDeveloper.co.za (Currently very inactive)

  6. #36

    Handling huge amounts of data

    I'm sorry for crashing the conversation so late. From what I understood, you want to store and search a "huge" amount of data - so much data that you can't load it all into memory.
    Therefore you need to store the information with an indexing table. There are many ways to implement that; for example, use two files (or more): a data file and an index file (or files).
    * When adding a new element, calculate its hash (based on the fields you're going to search on) and append the element to the data file, saving its position.
    * Save the position + hash in the index file.
    At runtime you only need to load the index file (which is small). When you need to find an element, you calculate its hash and jump to the correct location in the data file (this also works if the hash isn't a unique ID).

    Deletion is a small problem, as you will run into fragmentation... but it can be solved.
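    The add-element step could be sketched like this (all the record and routine names here are illustrative, not from any library; TElement stands in for whatever record type the data file holds):

    [pascal]Type
      TElement = Record               // stand-in for the real data record
        Name : String[30];
        Text : String[255];
      End;

      TIndexEntry = Record
        Hash     : Cardinal;          // hash of the search key
        RecordNo : Integer;           // element's position in the data file
      End;

      TElementFile = File of TElement;
      TIndexFile   = File of TIndexEntry;

    // Append the element to the data file and record its
    // (position, hash) pair in the index file.
    Procedure AddElement(Var DataF : TElementFile; Var IndexF : TIndexFile;
                         const E : TElement; KeyHash : Cardinal);
    Var
      IE : TIndexEntry;
    Begin
      Seek(DataF, FileSize(DataF));     // append at the end of the data file
      IE.RecordNo := FilePos(DataF);    // remember where the element lands
      IE.Hash := KeyHash;
      Write(DataF, E);
      Seek(IndexF, FileSize(IndexF));   // append the (hash, position) pair
      Write(IndexF, IE);
    End;
    [/pascal]

    A lookup then loads the (small) index file into memory, finds the entries whose Hash matches, and Seeks straight to each candidate RecordNo in the data file.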

    I hope I'm answering the right thing :?

    Goodluck

  7. #37
    PGD Community Manager AthenaOfDelphi's Avatar
    Join Date
    Dec 2004
    Location
    South Wales, UK
    Posts
    1,245
    Blog Entries
    2

    Handling huge amounts of data

    Quote Originally Posted by Paizo
    Thanks for the support, guys.

    I forgot to say that there are around 6000+ cards (and growing).

    So I can't store them all in a collection. I will use collections only for built decks, so I have to load the text from -somewhere- (i.e. a text file, .xls or other) and the images from a folder.

    You can understand that looking up a specific card's text in a file of 6000 cards, for a deck of 59+ cards (maybe all unique), is really laggy! That's why I asked about SQL queries.

    I will take a look at the link above, thanks again.

    With regards to the speed... I have just done a quick test. I populated a string list with 10000 random strings, each 10 characters long, and then ran through the list looking for ABCDEFGHIJ. I also searched each string using Pos() for the substring 'AB'. The whole process (populating and scanning the list) took less than 200ms. That's on an Athlon 800.

    Searching for strings etc. on in-memory data can be exceedingly fast.

    As for memory requirements... if you limit yourself to, say, 20MB, then each of your 6000 cards could carry about 3KB of data before you blow the 20MB limit. If a card really does carry that much data, can it be optimised once loaded? Link images to cards via an integer (4 bytes, as opposed to a string which could be any length), for example - that could save a whack of data.
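    A test along those lines might look like this (an illustrative reconstruction, not the original test code; timings will vary by machine, and GetTickCount from the Windows unit is only rough):

    [pascal]program ListSearchTest;

    uses Windows, Classes;

    Var
      List : TStringList;
      I, J, Hits : Integer;
      S  : String;
      T0 : Cardinal;
    Begin
      Randomize;
      List := TStringList.Create;
      try
        T0 := GetTickCount;
        For I := 1 to 10000 do            // populate with random 10-char strings
        Begin
          S := '';
          For J := 1 to 10 do
            S := S + Chr(Ord('A') + Random(26));
          List.Add(S);
        End;
        Hits := 0;
        For I := 0 to List.Count-1 do     // scan: exact match plus substring test
        Begin
          if List[I] = 'ABCDEFGHIJ' then
            Inc(Hits);
          if Pos('AB', List[I]) > 0 then
            Inc(Hits);
        End;
        WriteLn('Hits: ', Hits, ', elapsed: ', GetTickCount - T0, ' ms');
      finally
        List.Free;
      end;
    End.
    [/pascal]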
    :: AthenaOfDelphi :: My Blog :: My Software ::

  8. #38

    Handling huge amounts of data

    Quote Originally Posted by AthenaOfDelphi
    ...
    Searching for strings etc. on in-memory data can be exceedingly fast.
    ....

    I agree.
    I think saving a bit of RAM (1MB?) just to do the search on disk isn't worth it. Maybe a future version of the app will offer the option to run queries on the database and, looking at Athena's test, it seems that would be lag-free.
    Will: "Before you learn how to cook a fish you must first learn how to catch a fish." coolest

  9. #39
    Legendary Member cairnswm's Avatar
    Join Date
    Nov 2002
    Location
    Randburg, South Africa
    Posts
    1,537

    Handling huge amounts of data

    In my above post I meant Seek, not FileSeek.

    Here is an example of a custom data structure used to store data, along with the binary search function - note that the data must be inserted in alphabetical order.

    I cannot get timings for the search - I've tried files of up to 200000 records and all searches give me 0 millisecond response times....

    [pascal]unit QuickDB;

    interface

    Uses
      SysUtils;

    Type
      TDataRecord = Record
        Name : String[100];
        Data : Array[1..4] of String[255];
      End;

    Procedure MakeData(FileName : String; NumOfRecord : Integer);
    Function GetRecord(FileName : String; inName : String) : TDataRecord;

    implementation

    // Build a test file of NumOfRecord records, already in sorted order.
    Procedure MakeData(FileName : String; NumOfRecord : Integer);
    Var
      I  : Integer;
      F  : File of TDataRecord;
      DR : TDataRecord;
    Begin
      AssignFile(F,FileName);
      Rewrite(F);
      For I := 0 to NumOfRecord-1 do
      Begin
        DR.Name := 'Rec'+FormatFloat('000000000',I);
        DR.Data[1] := 'Data1';
        DR.Data[2] := 'Data2';
        DR.Data[3] := 'Data3';
        DR.Data[4] := 'Data4';
        Write(F,DR);
      End;
      CloseFile(F);
    End;

    // Binary search against the file on disk; the file must be sorted by Name.
    Function GetRecord(FileName : String; inName : String) : TDataRecord;
    Var
      L,H,M : Integer;
      F  : File of TDataRecord;
      DR : TDataRecord;
    Begin
      AssignFile(F,FileName);
      Reset(F);
      L := 0;
      H := FileSize(F)-1;
      // Standard binary search; the L <= H condition guarantees
      // termination even when the name is not present in the file.
      While L <= H do
      Begin
        M := (L+H) div 2;
        Seek(F,M);
        Read(F,DR);
        if DR.Name = inName then
          Break
        else if DR.Name > inName then
          H := M-1
        else
          L := M+1;
      End;
      CloseFile(F);
      // Caller should check Result.Name = inName; if it differs,
      // the record was not found.
      Result := DR;
    End;

    end.
    [/pascal]
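    A quick (hypothetical) driver to exercise the unit - the file name and record count are just examples:

    [pascal]program TestQuickDB;

    uses QuickDB;

    Var
      DR : TDataRecord;
    Begin
      MakeData('cards.dat', 200000);                 // build a sorted test file
      DR := GetRecord('cards.dat', 'Rec000000123');  // binary search on disk
      if DR.Name = 'Rec000000123' then
        WriteLn('Found ', DR.Name)
      else
        WriteLn('Not found');
    End.
    [/pascal]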
    William Cairns
    My Games: http://www.cairnsgames.co.za (Currently very inactive)
    MyOnline Games: http://TheGameDeveloper.co.za (Currently very inactive)

  10. #40

    Handling huge amounts of data

    I appreciate your way of solving problems by posting some code.
    I will run some tests soon.
    Will: "Before you learn how to cook a fish you must first learn how to catch a fish." coolest
