Hello Theo,
I hope it is ok to post in your area?
I would very much like to find out more about dimensioning an ARRAY vs allocating memory for nodes in a linked list.
Typically this becomes noticable for STRING arrays of 10k elements if the individuale strings are around 2k each.
I would like to know if using a Linked List of Strings is significantly different in speed when created like this:
'¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
FUNCTION llbAdd ALIAS "llbAdd" ( BYVAL pFirst AS LINKED_LIST_TYPE PTR, BYVAL sData AS STRING ) EXPORT AS DWORD
' Creates a linked list node of BINARY sData AS STRING (CHR$(0) are ok) and adds it to the end of the linked list
LOCAL pNext, pLast AS LINKED_LIST_TYPE PTR
IF gLenllType = 0 THEN gLenllType = LEN(@pNext) ' Set once
pNext = GetMem(gLenllType) ' Allocate the memory for the linked list node
IF pNext = 0 THEN
gsllLastError = "Node GetMem Failed in llbAdd()"
FUNCTION = -1
EXIT FUNCTION
END IF
@pNext.DataLen = LEN(sData)
@pNext.pData = GetMem(@pNext.DataLen) ' Allocate the memory for the Asciiz Data string
IF @pNext.pData = 0 THEN
gsllLastError = "Data GetMem Failed in llbAdd()"
FreeMem pNext
FUNCTION = -2
EXIT FUNCTION
END IF
CopyMemory @pNext.pData, BYVAL STRPTR(sData), @pNext.DataLen
IF pFirst THEN
pLast = llFindLast(BYVAL pFirst) ' Find Last Node
IF pLast THEN @pLast.pNext = pNext
FUNCTION = pFirst ' Return the sent first node
ELSE
FUNCTION = pNext ' Return the new first node
END IF
END FUNCTION
'¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
FUNCTION llFindLast ALIAS "llFindLast" ( BYVAL pFirst AS LINKED_LIST_TYPE PTR ) EXPORT AS DWORD
' Returns a pointer to the last node in the list
LOCAL pLast AS LINKED_LIST_TYPE PTR
IF pFirst = 0 THEN '
FUNCTION = 0
EXIT FUNCTION
END IF
pLast = pFirst
DO UNTIL @pLast.pNext = 0 ' Search through the nodes looking for the one with pNext NOT set
pLast = @pLast.pNext ' Return the one before
LOOP
FUNCTION = pLast
END FUNCTION
'¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
FUNCTION GetMem( BYVAL iBytes AS DWORD ) AS DWORD
LOCAL pMem AS DWORD
pMem = GlobalAlloc(%GMEM_FIXED OR %GMEM_ZEROINIT, iBytes) ' Allocates fixed memory. Initializes memory contents to zero
IF pMem THEN
FUNCTION = pMem
ELSE ' Try Again
pMem = GlobalAlloc(%GMEM_FIXED OR %GMEM_ZEROINIT, iBytes)
IF pMem THEN
FUNCTION = pMem ' "Success this time"
ELSE
FUNCTION = -1 ' "GetMem FAILURE AGAIN"
END IF
END IF
END FUNCTION
If we leave out the overhead of COPYMEMORY() and llFindLast() for the sake of discussion, is the allocation of memory for the node faster or slower than using DIM MyArray(9999)?
(full Linked List code available from Don Dickinson at www.greatwebdivide.com)
Mike, I can't answer your question directly, but I ran across this interesting site for c++ coding that finally explains linked lists, stacks, queues, vectors in such a way that it all makes sense.
The basic highlights:
Arrays are the fastest.
Linked Lists are the basis for all the other data structure systems like stacks and queues
Linked List gives you total control of putting things in and out of the list anywhere.
Stack is a linked list with Last In / First Out, there are not linked list routines for putting things anywhere in the middle. Think Stack of plates.
Queues are like stacks but First In / First Out, again just think of a line of customers waiting in a line, the first in line gets served first.
A Vector is nothing more than a single dimensional array with methods to make using it easier.
http://www.aaroncox.net/tutorials/miscellaneous/index.html
Great link for an explanation. Thank you.
What I am wondering about is the speed of PB memory allocation using DIM vs ALLOC for the linked list memory allocation. I suspect theo might have a few words to say on the subject :)
Before I can answer, the answer is already given.
While finally its not that easy, as you are talking from "String Arrays" not from Arrays in General.
Using Arrays in general the Memory is been allocated by the BASIC Runtime in one piece.
But not with dynamic String Arrays.
Using dynamic Elements, memory allocation is done for each element separate, they can even be in differen memory areas.
As most of the time will be used with memory management of the idividual strings themselves, I would actually not say that that the rule "Array is significantly faster then linked list" will be true for String Arrays with dynamic strings.
In this case it may depend on what shall be done with the strings. How often they change content or not,
and how the linked list are implemented. Your code looks promissing, anyway the question is, what happens to the data,
how efficient can the CPU Cache handle the memory requests.
Some links:
http://articles.techrepublic.com.com/5100-22-1050183.html
http://cslibrary.stanford.edu/103/LinkedListBasics.pdf
>Using dynamic Elements, memory allocation is done for each element separate, they can even be in differen memory areas.
Exactly.
OK let me write some pseudo code:
sMyArray() AS STRING
DIM sMyArray(9999) ' Step 1
FOR i = 1 TO 9999
sMyArray(i) = SPACE$(1999)' a 2k string - Step 2
NEXT
' Vs
pRoot AS DWORD
pRoot = 0 ' This indicates that we are creating the first node in a new linked list
FOR i = 1 TO 9999
pRoot = llbAdd( pRoot, SPACE$(1999) ) ' On Creating First Node, pRoot becomes "handle" to the first node in the list
NEXT
Using a PB Array might be slower depending on how the memory is allocated for the BSTR dynamic strings by PB (hence the decompile question). With the Linked list, the memory is allocated with:
pMem = GlobalAlloc(%GMEM_FIXED OR %GMEM_ZEROINIT, iBytes)
It would seem that this might be faster?
It may depend on Element size and position.
I could imagine, that the string engine may be faster with smaller Elements, while it may be slower with bigger elements.
But finally you can do a test. In this case an Dissassembly would not help much as most time is spent in system-calls.
If it is a situation where there is a large number of additions and deletions from your list over the life of your program then I suggest that a Linked List is much better than an Array in terms of efficiency and speed. I use Linked Lists for just about everything these days because I find that they are much easier to work with than arrays especially when you need to dynamically create and destroy multiple lists throughout your program.
I posted Linked List code in the PB source code forum. It is similar to Don's but it is a little faster (mine is double linked whereas Don's is single) and it integrates well into the Hash Table code that I also posted.
QuoteBut finally you can do a test. In this case an Dissassembly would not help much as most time is spent in system-calls.
Oh ok. For some reason I thought you would need to dissasemble the Dimension statements for arrays. I can certainly do a test.
Paul,
I looked at your double linked list briefly. I did not immediately see the benefit, but then I am not sure I understand what the double linking does.
I removed the name parameter from the Dons linked list so no strings are harmed in the making of the list (and I loose the ability to retrieve nodes by name, but I never do that)
In this situation (think SQL server) I do not delete nodes or move them around, its just a convenient way to assign data to memory without knowing in advance how many items may be returned (allthough in this specific case I do, I think) but regardless, I would like to use the linked list more and just not care if the number of elements are known.
I agree that the convenience of freeing the memory and essentially destroying the list is great, but I need to test the speed of dimming an anrray and then redimming it, vs creating the list and freeing it.
After a few months working with Linked lists, String Builder class and memory allocation including paged, I have realized there is very little in it in terms of assigning memory. PB strings do just a good a job as Malloc etc. The real speed trick is to allocate all the memory you will need in one initial assignment as happens with an array. I am sure this is no surprise to anyone, but I did wonder about the memory assignment.
Thanks Mike for updating us on your findings after such testing.
Mike,
maybe you already saw it but in http://www.thinbasic.com/index.php?option=com_docman&task=cat_view&gid=27&Itemid=66
I've posted some code called "Dynamic arrays" that maybe can give you another point of view on PB dynamic memory allocation.
Ciao
Eros
Hi Eros,
Yes I looked at this a while ago. Very nice.
I ended up using the String Builder class. Its so simple and elegant.