
Repeat append algorithm

Started by Steve Hutchesson, December 09, 2009, 01:10:50 PM


Steve Hutchesson

A common problem in almost every dialect of basic is that if you use the legacy basic functions to repeatedly append text or binary data to the end of a current string buffer, each append gets slower than the last. Nothing is wrong with the basic dialects that supply this legacy capacity; it is a fundamental limitation of the language specification. By design, basic creates a new string each time an append is done, which means copying the contents of both the old string and the appended string into yet another new string.

At its worst, the shorter the appended string and the more times it has to be appended, the slower the process becomes because of the amount of data that must be copied. There has always been a solution, but it involved knowing in advance how big the result would be, and that is not always easy. The following algo makes the equation easily adjustable: it will auto-realloc the memory by way of a normal basic string whenever the next append would run past the end of the current buffer.
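
To show the idea in plain code before the test piece, here is a minimal pure-basic sketch of the technique; the name repappend_bas and the exact details are mine for illustration only and it is not the inline-assembly routine listed below. It assumes, like the asm version, that the appended string is shorter than the realloc chunk.

' Minimal pure-basic sketch (hypothetical name repappend_bas): write in place
' at cloc, grow the buffer by one chunk when the next append would not fit,
' return the updated write position.
FUNCTION repappend_bas(buf$, addit$, BYVAL cloc AS DWORD, BYVAL iextra AS DWORD) AS DWORD
    LOCAL alen AS DWORD
    alen = LEN(addit$)
    IF cloc + alen > LEN(buf$) THEN                     ' next append would pass the end
      IF iextra < 1024*1024 THEN iextra = 1024*1024     ' enforce a minimum chunk size
      buf$ = buf$ + NUL$(iextra)                        ' one reallocation per chunk, not per append
    END IF
    MID$(buf$, cloc + 1, alen) = addit$                 ' in-place write, no reallocation
    FUNCTION = cloc + alen                              ' updated write position
END FUNCTION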

Its design is a trade-off between initial memory allocation, reallocation chunk size and speed. The fastest technique is to allocate a buffer big enough to hold the entire result so that no reallocation occurs at all; the algorithm then runs at about the limit of a byte-level data append. It is written with a minimum 1 megabyte reallocation chunk, which it enforces if the requested size is smaller, but it will be a lot faster if you set the realloc size much larger.
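
To put rough numbers on that trade-off for the test piece below (1,000,000 appends of 100 bytes, about 95 megabytes in total), here is a small standalone back-of-envelope sketch, compiled on its own; the figures are estimates, not measurements.

' Back-of-envelope estimate only: number of string reallocations (and full
' buffer copies) implied by a given realloc chunk size for the test below.
FUNCTION PBMAIN() AS LONG
    LOCAL total AS DWORD, chunk AS DWORD
    total = 1000000 * 100                               ' 100,000,000 bytes appended in the test
    chunk = 1024 * 1024                                 ' default 1 megabyte realloc chunk
    STDOUT "Approx reallocations:" + STR$((total + chunk - 1) \ chunk)   ' about 96
    ' Preallocating buffer$ = nul$(1024*1024*128) up front means zero reallocations.
END FUNCTION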

If the appended strings were of a reasonable size it would be attractive to write bytes only until the destination was 4-byte aligned and then continue the copy at a DWORD level, but in practice the data being appended is almost always short, and a byte-level copy is far less messy and clunky to code reliably.

Below is the test piece.


#IF 0  ' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    Flexible repeat append basic string algorithm
    Build with PBCC50

#ENDIF ' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    DECLARE FUNCTION GetTickCount LIB "KERNEL32.DLL" ALIAS "GetTickCount" () AS DWORD

' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

FUNCTION PBmain as LONG

    LOCAL cloc as DWORD
    LOCAL lpcn as DWORD
    LOCAL tcnt as DWORD

    buffer$ = ""
    ' buffer$ = nul$(1024*1024*128)

    addin$ = "12345678901234567890123456789012345678901234567890"+_
             "12345678901234567890123456789012345678901234567890" ' 100 characters

    tcnt = GetTickCount
  ' ------------------------------------------------------
    cloc = 0    ' start at beginning of buffer
    lpcn = 0    ' zero the loop counter

    Do
      cloc = repappend(buffer$,addin$,cloc,1024*1024)
      ! add lpcn, 1
    Loop while lpcn < 1000000               ' 1 million appends at 100 bytes each

    buffer$ = left$(buffer$,cloc)           ' trim off extra memory
  ' ------------------------------------------------------
    tcnt = GetTickCount - tcnt

    StdOut "Duration =" + str$(tcnt)

    StdOut str$(cloc)                       ' display the written byte length

    Stdout chr$(13,10)+"Press any key to exit ...."
    Do
      Sleep 1
    Loop while inkey$ = ""

    FUNCTION = 0

End FUNCTION

#IF 0  ' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    ---------------------------------------

    buffer$ = ""
    ' buffer$ = nul$(1024*1024*128)

    addin$ = "12345678901234567890123456789012345678901234567890"+_
             "12345678901234567890123456789012345678901234567890" ' 100 characters

    cloc = 0    ' start at beginning of buffer
    lpcn = 0    ' zero the loop counter

    Do
      cloc = repappend(buffer$,addin$,cloc,1024*1024)
      ! add lpcn, 1
    Loop while lpcn < 1000000               ' 1 million appends at 100 bytes each

    buffer$ = left$(buffer$,cloc)           ' trim off extra memory

    ---------------------------------------

    NOTES

      The algorithm addresses a fundamental problem with sequential string appends in
      basic: the reallocation and copying of memory on every append.

      This algorithm is at its fastest when the original buffer receiving the appended
      data is large enough to hold it all without any reallocation of memory. It becomes
      progressively slower as the buffer size decreases, because more reallocations must
      be done. This is adjustable in two (2) ways.

      1. The larger the initial buffer, the fewer times the memory is reallocated.
      2. The larger the extra (realloc) buffer size, the fewer times the memory is reallocated.

      When the append operation is complete, trim off any excess memory with the
      standard LEFT$() basic string function, using the last return value from the
      function as the length.

#ENDIF ' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

FUNCTION repappend(src$,addit$,ByVal cloc as DWORD,ByVal iextra as DWORD) as DWORD

    #REGISTER NONE

  ' ------------------------------------------------------
  ' src$        the main buffer to append extra data to.
  ' addit$      the byte data to append to the main buffer
  ' cloc        current location pointer
  ' iextra      size of increase to buffer, min 1 meg
  ' ------------------------------------------------------

    LOCAL pstr as DWORD

    pstr = StrPtr(src$)
    ! mov esi, pstr

    pstr = StrPtr(addit$)
    ! mov edi, pstr

    ! mov ebx, [esi-4]          ' stored buffer length
    ! mov eax, cloc
    ! add eax, [edi-4]          ' write location counter AND addit$ length to EAX

    ! cmp ebx, eax              ' compare source buffer size with the combined length
    ! jg nxt1                   ' buffer still big enough, skip the realloc

    ! cmp iextra, 1024*1024     ' if realloc, check buffer size
    ! jge nxt0
    ! mov iextra, 1024*1024     ' set a minimum extra buffer size

  nxt0:
      extra$ = nul$(iextra)     ' allocate a buffer of "iextra" size
      src$ = src$ + extra$      ' realloc source string size
      pstr = StrPtr(src$)
    ! mov esi, pstr

  nxt1:
    ! mov ecx, 1
    ! or eax, -1
    ! add esi, cloc             ' add current location pointer for next copy position

  ' -----------
  ' unroll by 2
  ' -----------
  lbl0:
    ! add eax, ecx
    ! movzx edx, BYTE PTR [edi+eax]
    ! mov [esi+eax], dl
    ! test edx, edx
    ! jz lbl1                   ' exit on written terminator

    ! add eax, ecx
    ! movzx ebx, BYTE PTR [edi+eax]
    ! mov [esi+eax], bl
    ! test ebx, ebx
    ! jnz lbl0                  ' loop until the terminator has been written

  lbl1:
    ! add eax, cloc             ' location
    ! mov FUNCTION, eax         ' return updated offset pointer

END FUNCTION

' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

Frederick J. Harris

I've often wondered, Steve, if PowerBASIC does a re-allocation with every new string concatenation, or if it uses some alternate method like doubling the allocation at each concatenation and keeping track of the present memory allocation size. Recently I modified my C++ string class to operate that way and it certainly cuts down on re-allocations and string copies, which are really the devil - as you know.
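
For comparison, the doubling policy Fred describes might look something like the sketch below, written here in PowerBASIC to match the thread since his C++ class is not shown; the name repappend_dbl and the details are assumptions, not his code.

' Hypothetical doubling variant: grow to at least twice the current size
' whenever the next append would not fit, so reallocations are amortised.
FUNCTION repappend_dbl(buf$, addit$, BYVAL cloc AS DWORD) AS DWORD
    LOCAL needed AS DWORD, newsize AS DWORD
    needed = cloc + LEN(addit$)
    IF needed > LEN(buf$) THEN
      newsize = LEN(buf$) * 2
      IF newsize < needed THEN newsize = needed         ' never grow less than required
      IF newsize < 1024 THEN newsize = 1024             ' small floor for an empty buffer
      buf$ = buf$ + NUL$(newsize - LEN(buf$))           ' single realloc, size doubles over time
    END IF
    MID$(buf$, cloc + 1, LEN(addit$)) = addit$          ' in-place write at the current position
    FUNCTION = needed
END FUNCTION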


Steve Hutchesson

Fred,

I think it's a case of how basic is specified in the first place.

a$ = a$ + b$

This common style of code requires the standard append operator "+" to check the length of both a$ and b$, allocate a new string, then copy a$ and b$ into it. It is plenty fast enough for common small tasks, but sequential appending on a large scale finds the limit of that type of design very quickly. It's interesting that C++ does it the same way as basic, with the same limitation, where C traditionally would have allocated a buffer and resized it on an as-needed basis.
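
For illustration, this is the pattern being described; each pass re-copies everything appended so far. It is a standalone sketch, not part of the test piece above.

' Naive sequential append: every "+" allocates a new string and copies both
' operands, so each pass costs the length of buffer$ and the loop is
' quadratic overall.
FUNCTION PBMAIN() AS LONG
    LOCAL lpcn AS LONG
    addin$  = STRING$(100, "1")          ' 100 byte test string
    buffer$ = ""
    FOR lpcn = 1 TO 100000
      buffer$ = buffer$ + addin$         ' full copy of buffer$ on every append
    NEXT lpcn
    STDOUT STR$(LEN(buffer$))
END FUNCTION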

In very old DOS basics from Microsoft it was usually faster to open a file and write to it sequentially, since the disk operating system managed the output by extending the file up to the limits of disk size, but in a modern era of far larger linear memory addressing, building the string directly in memory is a lot faster.