Structuring Inline Assembler

Charles Pegge · October 03, 2007, 01:13:43 PM

When you are familiar with some of the instructions, inline assembler is actually very easy to write, and you wonder why people bother with high-level languages when there is such a sophisticated CPU at your disposal.

But after 100 or so lines of assembler, the reason becomes apparent: you begin to lose track of the code, with all the conditional jumps and new labels that have to be created. This is the same problem that early forms of BASIC had: a lack of block structure.

There are two ways round this:

One is to embed the inline assembler between BASIC block structures, which means dipping in and out of assembler - sharing integer variables with the host system.

The other is to write the assembler sections with block structure labels and then put the source code through a preprocessing stage to turn these structures into unique labels acceptable to the compiler.

This has the advantage of producing more efficient code, since you can make use of the CPU registers from start to finish without saving them to variables and handing back to the host BASIC every time you reach the end of a block.

Because the compiler no longer sees the source code you wrote, the preprocessor can make debugging more difficult, but this is a small price to pay for structured programming where it is needed.

Here is part of the word parser I am using for R$. It scans the text for the next word, but ignores comments (which start with ASCII 124, the "|" character, and are terminated by a line feed).

The block notation supports unlimited nesting:

do
exit
repeat
if
endif
end

BEFORE

   asm
  xor eax,eax
  mov edx,[lsi]
  sub edx,[i]
  mov esi,[i]
  add esi,[p]
  dec esi
  `do
   inc esi
   dec edx
   jl `exit
   mov al,[esi]
   cmp al,32
   jle `repeat
   cmp al,124
   jnz `exit
   `do
    inc esi
    dec edx
    jl `exit
    mov al,[esi]
    cmp al,10
    jnz `repeat
   `end
   jmp `repeat
  `end
  sub esi,[p]
  mov [i],esi
  mov [c],al
 end asm

AFTER

  asm
  xor eax,eax
  mov edx,[lsi]
  sub edx,[i]
  mov esi,[i]
  add esi,[p]
  dec esi
  do0:
   inc esi
   dec edx
   jl end0
   mov al,[esi]
   cmp al,32
   jle do0
   cmp al,124
   jnz end0
   do1:
    inc esi
    dec edx
    jl end1
    mov al,[esi]
    cmp al,10
    jnz do1
   end1:
   jmp do0
  end0:
  sub esi,[p]
  mov [i],esi
  mov [c],al
 end asm

The preprocessor is itself written in R$, so I am using R$ reflexively to develop R$.
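For readers who do not read R$, the label expansion can be sketched in Python. This is only an illustration of the scheme, not the R$ preprocessor: the function name expand_blocks and the use of a regular expression are assumptions made for this sketch. It uses one counter shared by `do and `if, plus a stack of open blocks, so that `exit, `repeat and `end always resolve to the innermost loop.

```python
import re

def expand_blocks(lines):
    """Replace `do/`exit/`repeat/`end and `if/`endif markers with unique labels."""
    counter = 0      # one counter shared by `do and `if blocks
    do_stack = []    # numbers of the currently open `do blocks
    if_stack = []    # numbers of the currently open `if blocks

    def substitute(match):
        nonlocal counter
        word = match.group(1)
        if word == "do":                 # loop start: emit a label definition
            do_stack.append(counter)
            counter += 1
            return f"do{do_stack[-1]}:"
        if word == "if":                 # jump operand: target is the matching endif
            if_stack.append(counter)
            counter += 1
            return f"endif{if_stack[-1]}"
        if word == "exit":               # jump operand: end of the innermost loop
            return f"end{do_stack[-1]}"
        if word == "repeat":             # jump operand: start of the innermost loop
            return f"do{do_stack[-1]}"
        if word == "end":                # loop end: emit its label and close the block
            return f"end{do_stack.pop()}:"
        if word == "endif":              # emit the endif label and close the block
            return f"endif{if_stack.pop()}:"
        return match.group(0)            # unknown marker: leave it untouched

    return [re.sub(r"`(\w+)", substitute, line) for line in lines]


if __name__ == "__main__":
    sample = ["`do", " dec edx", " jl `exit", " jmp `repeat", "`end"]
    print("\n".join(expand_blocks(sample)))
```

Because the stacks are ordinary lists, nesting depth is limited only by memory, which matches the "unlimited nesting" of the block notation.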

The original BASIC routine, using the string pointer technique:

' do                                 ' skip spaces
'  if i>=lsi then c=0:exit do        ' 
'  c=p[i]                            '
'  if c=124 then                     ' skip line comment
'   do
'    i+=1:if i>=lsi then c=0:exit do '
'    if p[i]=10 then c=10:exit do    '
'   loop                             '
'   continue do                      '
'  end if                            '
'  if c>32 then exit do              '
'  i+=1                              '
' loop                               '
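
The same scan can be restated in Python for anyone who wants to experiment with the logic outside BASIC. This is a sketch under assumptions: the function name skip_to_word and the (index, character) return pair are invented for illustration, and the text is taken as a bytes object so that indexing yields character codes, as p[i] does in the BASIC version.

```python
def skip_to_word(text: bytes, i: int = 0):
    """Return (index, code) of the next word character, or (index, 0) at end of text."""
    n = len(text)
    while True:
        if i >= n:
            return i, 0              # ran off the end: c = 0
        c = text[i]
        if c == 124:                 # '|' starts a line comment
            while True:
                i += 1
                if i >= n or text[i] == 10:
                    break            # stop at end of text or at the line feed
            continue                 # re-test from the line feed (or the end)
        if c > 32:
            return i, c              # first character above space: a word starts here
        i += 1                       # skip spaces and control characters


if __name__ == "__main__":
    print(skip_to_word(b"  | a comment\n  mov eax,1"))
```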

And here is the R$ code for the preprocessor so far. It also performs other tasks, such as date stamping and collecting program statistics.

|
| R$ POSTFIX LANGUAGE
|
| Charles E V Pegge
|
| Oct 2007

(
13 chr ( 10 chr ) , : crlf
"r$.dat" filein : fi
if ne then
"r$.txt not accessible " ?
0 returns
endif

"rr$.bas" fileout : fo
0 : dimsc
0 : funsc
0 : subsc
0 : asmsc
0 : lopsc
0 : linsc

"" : iflis
"" : dolis
0 : csy

| FOR GENERATING STRUCTURED ASSEMBLER LABELS
| if .. endif do .. exit .. repeat .. end
|------------------------------------------
def block_strucs
1 + : p
s p 10 mid read word : w len : lw
w
(
"if" cmp if eq then
"endif" ( csy str ) , " " , iflis , to iflis
"endif" ( csy str ) , : sw
p 1 - : p p lw + 1 + : q
ref s sw p q insert s ?
incr csy
endif
)
(
"do" cmp if eq then
"end" ( csy str ) , " " , dolis , to dolis
"do" ( csy str ) , ":" , : sw
p 1 - : p p lw + 1 + : q
ref s sw p q insert s ?
incr csy
endif
)
(
"endif" cmp if eq then
p 1 - : p p lw + 1 + : q
iflis read word ":" , ref s swap p q insert s ?
iflis " " 1 pos 1 + iflis swap 1000 mid to iflis
endif
)
(
"end" cmp if eq then
p 1 - : p p lw + 1 + : q
dolis read word ":" , ref s swap p q insert s ?
dolis " " 1 pos 1 + dolis swap 1000 mid to dolis
endif
)
(
"exit" cmp if eq then
p 1 - : p p lw + 1 + : q
dolis read word ref s swap p q insert s ?
endif
)
(
"repeat" cmp if eq then
p 1 - : p p lw + 1 + : q
"do" ( dolis read word 4 16 mid ) ,
ref s swap p q insert s ?
endif
)
end | block_strucs

| MAIN LOOP FOR EACH LINE
"" ?
(
fi eof if true then exit endif
fi in : s
( incr linsc )

| STATISTICS TALLY
(
read word : w
( "function" cmp if eq then incr funsc endif )
( "sub" cmp if eq then incr subsc endif )
( "loop" cmp if eq then incr lopsc endif )
(
"end" cmp if eq then
word : w1
(
"asm" cmp if eq then incr asmsc endif
)
endif
)
s
(
| for block structured assembler
'`' 1 pos
if is then
block_strucs
endif
)
(
"dim shared " 1 pos
if true then incr dimsc endif
)
)
"\" 1 pos
if null then
s crlf , fo out repeat
endif

| INFO PATCH-INS
s "\preprocessor" 1 pos : p
if is then
ref s "Preprocessed with R$ self coding manager" p p 13 + insert s ?
endif
s "\date" 1 pos : p
p if is then
date : da
1 2 mid : d
0 : i
" 01 Jan 02 Feb 03 Mar 04 Apr 05 May 06 Jun 07 Jul 08 Aug 09 Sep 10 Oct 11 Nov 12 Dec "
( d 1 pos 3 + to i )
i 3 mid : e
"" ( da 4 2 mid ) , " " , e , " " , ( da 7 4 mid ) , to e
d
ref s e p p 5 + insert s ?
endif
s "\time" 1 pos : p
p if is then
time 1 5 mid : t
ref s t p p 5 + insert s ?
endif
s crlf , fo out
repeat
)
fi close
fo close
"" ?
" STATS:" ?
" Global Variables: " ( dimsc str ) , ?
" Functions: : " ( funsc str ) , ?
" subs: : " ( subsc str ) , ?
" assembler: : " ( asmsc str ) , ?
" loops: : " ( lopsc str ) , ?
" lines: : " ( linsc str ) , ?
|"fbc r$.bas" shell
"done" ?

)

end | main

Charles Pegge · October 04, 2007, 07:07:16 PM

Having tried this system of structured assembler for the past day or so, I can say it has made a dramatic improvement in coding speed and reliability, possibly a fourfold increase in productivity, comparing favourably with a high-level language.

Structured code is used almost universally throughout the programming community, across many languages, and we take it for granted without appreciating what a difference it makes compared with using line numbers or labels.

This, for me, has substantially bridged the gap between Assembler and Basic, so my plan is to Assemblerise most of the R$ code. Like hand-blending mayonnaise, this is best done gradually, so that my project won't curdle.

Kent Sarikaya · October 05, 2007, 06:10:56 AM

Reading your work, Charles, I feel like a radio operator picking up signals from an advanced civilization. You and some of the other gurus on these forums are in another league. It is fun trying to understand what I can from you guys' interesting posts!

Charles Pegge · October 06, 2007, 11:02:19 AM

Well I am not too far away from planet earth.

Some of my ideas on this might be a little strange but I will always try to do some preliminary research to see if they are feasible.

One thing I would like to try now is to explore the idea of a MetaBasic, to coin a word: extending some of the work I have done so far to see whether some of the items on the PowerBasic wish list can be satisfied using a preprocessor script.

I would start with the easy ones, like collecting all the structures and prototypes and creating a header file to declare them. This would automate an otherwise tedious chore.
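
As a rough idea of how small such a script could be, here is a Python sketch that collects prototypes only (structures are left out). It rests on several assumptions: the name make_declarations is hypothetical, each FUNCTION or SUB header is assumed to sit on a single line with no line continuation, and a declaration is assumed to be the header line prefixed with DECLARE.

```python
import re

# A header line starts with FUNCTION or SUB followed by an identifier.
# This skips "END FUNCTION" and return-value assignments such as "FUNCTION = x".
PROTO = re.compile(r"^\s*(FUNCTION|SUB)\s+[A-Za-z_]\w*", re.IGNORECASE)

def make_declarations(source_lines):
    """Return one DECLARE line for every FUNCTION or SUB header found."""
    return ["DECLARE " + line.strip() for line in source_lines if PROTO.match(line)]


if __name__ == "__main__":
    program = [
        "FUNCTION Add(BYVAL a AS LONG, BYVAL b AS LONG) AS LONG",
        "  FUNCTION = a + b",
        "END FUNCTION",
    ]
    print("\n".join(make_declarations(program)))
```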

Another task that can be automated arises when combining two programs that both contain GLOBAL variables: the names can be extended / mangled / decorated to avoid a clash, effectively creating NameSpaces. But the programs may want to share some of their GLOBALS, so do we call these SOLAR variables?
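
A minimal Python sketch of the name-decorating idea, under stated assumptions: the function name mangle_globals, the prefix_name naming scheme and the shared parameter (standing in for the SOLAR variables) are all hypothetical. The sketch is deliberately naive and would also rename matching words inside string literals and comments.

```python
import re

def mangle_globals(source: str, prefix: str, shared=()):
    """Prefix every GLOBAL variable name with `prefix`, except the shared ones."""
    shared_lower = {name.lower() for name in shared}   # BASIC names are case-insensitive
    names = re.findall(r"^\s*GLOBAL\s+([A-Za-z_]\w*)", source,
                       flags=re.IGNORECASE | re.MULTILINE)
    for name in names:
        if name.lower() in shared_lower:
            continue                                   # shared ("SOLAR") names stay as they are
        # Whole-word replacement, so 'count' does not touch 'counter'.
        source = re.sub(rf"\b{re.escape(name)}\b", f"{prefix}_{name}",
                        source, flags=re.IGNORECASE)
    return source


if __name__ == "__main__":
    program = "GLOBAL count AS LONG\nGLOBAL total AS LONG\ncount = count + 1\n"
    print(mangle_globals(program, "app", shared=["total"]))
```

Running each program through this with a different prefix before combining them would keep their private globals apart, while names listed as shared still refer to the same variable.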

Kent Sarikaya · October 06, 2007, 08:15:23 PM

First your MetaBasic sounds like a great idea!!!

About globals: so there would still be globals, but prefixing them with the program name would make them global only within that program's namespace.

Now, you would also like to have globals that are wider in scope than namespace globals. Solar is fine, but Galactic starts with a g and sounds cool: Galactic Globals. So if a namespace global were to be universally available to all other apps, would it have a hidden galactic_namespace_VariableName naming scheme?

Charles Pegge · October 06, 2007, 09:43:29 PM

Well, I thought we could go up one denomination. Compared with SOLAR, GALACTIC is awfully big, and might be needed for higher levels of integration. FreeBasic supports COMMON SHARED variables, which allow a similar integration of common GLOBAL variables, but this is for tying programs together at compile time, sharing variables of the same name between object modules. Since my idea operates at the source code level, I think it is worth making that distinction.

Operating systems, of course, don't like to share their variables directly with anything, to prevent erratic applications from trashing the system and invoking the black screen of death.

Kent Sarikaya · October 07, 2007, 08:14:30 PM

Solar is nice, in that it perhaps captures the idea better. The Solar variable is the center (focal point) of the programs (planets) that use it. Its value (solar rays) is available to all the planets.
