Doing Your Own Registration Key Process

Donald Darden · May 15, 2008, 11:01:18 PM

Registration keys can be an effective way to ensure that you can track released software. If you make a company's name and user a part of the generation process, you can always determine at what point the software became compromised, then consider action against any offender. By taking the product itself, and even the version involved, into consideration, you ensure that one key cannot be used for other products or other versions.

I got interested in Microsoft's license key process because it is self-authenticating. That is, each key can be used to verify itself, to make sure it is valid, before you are allowed to register the product. I don't know how Microsoft does this, but I can emulate the process of self-validation with techniques that I devised. During my efforts to do this, I learned that there are good and sufficient reasons for having a key length of 25 characters rather than a shorter one.

In other words, my keys will not work with Microsoft products (or other products either), but the program is sort of a proof of concept. In fact, it works pretty darn well. I'm thinking of three related products from this:

(1) A custom program for generating self-evaluating keys, all specific to a given product and customer. You can store the data into a validation database and send it to a customer for activating their product.

(2) A DLL module you can call from your program to validate an entered key before accepting registration or permitting the program to run in normal mode.

(3) Source code for the above, but each sale is for a custom implementation, so that even if someone else buys the source, they cannot duplicate your results.

What do you guys think?

Donald Darden · May 17, 2008, 02:35:28 AM

While I can't (or won't) discuss the specifics of how I created my registration key generator and verifier, I will discuss the basics that should enable you to create your own. That is, I will try to make you aware of various techniques that can be employed, so that you can try to encode your own.

The first technique is that whatever your process does, it must be fully reversible. Ever give someone driving instructions to and from a destination? For every left turn going, there had to be a right turn coming back, and for every right turn there was a left turn coming back, and these were in the reverse order of occurrence. So if the instructions were to turn left, turn left, then turn right, coming back it would be turn left, turn right, then turn right.
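The driving analogy is easy to check in a few lines. This is only an illustrative sketch in Python (not the PowerBASIC used later in this thread), and `reverse_route` is a made-up helper name:

```python
def reverse_route(turns):
    """Return the turns needed to drive back along the same route."""
    opposite = {"left": "right", "right": "left"}
    # Walk the outbound turns backwards and flip each one.
    return [opposite[t] for t in reversed(turns)]

outbound = ["left", "left", "right"]
print(reverse_route(outbound))  # ['left', 'right', 'right']
```

Reversing the return route gives back the outbound route, which is the property a reversible encoding needs.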

The second rule of creating a registration key is limiting the character set to just the letters and digits. That is, A to Z, and 0 to 9. By convention, we only use the upper case letters, and we should let people enter characters in either upper or lower case and convert them to upper case during our process.

The convention is to use a hyphen (-) as a separator between groups, and most registration keys have five characters in a group. People can manage groupings like this fairly easily. Some people prefer to use spaces instead of hyphens, or let you include spaces as well as hyphens, such as ABCDE - 01234.

The verification part of a registration key means that something either has to be buried in the key, or something has to be excluded. If you don't know what these are, and don't include or exclude them as required, the key will fail a validity check. And while I won't discuss how I did it, let's assume that the registration key must include the word "REAL" in the example I am discussing here.

The next consideration is that we are going to need to use a pseudo-random process to generate combinations of letters and numbers that will make up the bulk of the registration key. For this, we need the RANDOMIZE and RND() commands in BASIC. And we need a seed value that will always be the same for a given key, so that we will always produce the same sequence when we use the RND() function to create our sequence.

The generation process is used to create the key and hide our sequence reference (our seed value for RANDOMIZE) and our verification code ("REAL") somewhere in our key.

The verification process attempts to recover the seed value from a manually entered key, and then calls the generation process to see if it can reproduce the same key as was entered. If they match, then the manually entered key is valid. This match-or-no-match (TRUE or FALSE) approach means we can safely ignore the first rule above in this instance, because we are not concerned with the actual contents of the registration key, only that the hidden seed value can be recovered and used to generate a key identical to the manually entered one.

Now, to hide how we actually generate the key with the random process, we want to obscure our tracks even more. And we can use the random generator for this as well. First, we create a string with all the valid characters that can be entered as part of a key like this:

[code]
  aa$ = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
[/code]
Then when we decide to pick a certain character, we use the RND() value to make the selection, something like this:
[code]
  bb$ = MID$(aa$, RND(1, LEN(aa$)), 1)
[/code]

We can then build up our registration key in this manner:

[code]
  kk$ = ""
  FOR a = 1 TO 25
    kk$ = kk$ + MID$(aa$, RND(1, LEN(aa$)), 1)   'append one randomly chosen character
  NEXT
[/code]
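The point of seeding is that the same seed always reproduces the same string. A minimal Python sketch of that idea follows; it uses Python's `random` module rather than BASIC's RANDOMIZE and RND(), so the keys it produces will not match anything PowerBASIC generates, and `random_key_body` is a hypothetical name:

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def random_key_body(seed, length=25):
    """Pick `length` characters from ALPHABET, fully determined by `seed`."""
    rng = random.Random(seed)  # private generator, so nothing else disturbs the sequence
    return "".join(rng.choice(ALPHABET) for _ in range(length))

first = random_key_body(1234)
second = random_key_body(1234)
print(first == second)  # True: same seed, same sequence
```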

But now we want to get tricky, because we don't want anyone to be able to figure out how we did this, and even though this key looks secure, it can be made a whole lot tougher to crack. And we still need to get our keyword of "REAL" in there, as well as the seed value we used to generate the apparently random series of characters.

If I just stuck the word "REAL" in the keycode, it would stand out like a sore thumb. So how can we hide it? Well, what if we use it as part of our coding process, so that it is there, but in altered form? For this, let's first construct a blank version of our registration key:

[code]
    kk$ = SPACE$(25)          'the number of characters we want for our key
    MID$(kk$, 12) = "R"       'arbitrary point for the "R"
    MID$(kk$, 7) = "E"        'arbitrary point for the "E"
    MID$(kk$, 19) = "A"       'arbitrary point for the "A"
    MID$(kk$, 3) = "L"        'arbitrary point for the "L"
    FOR a = 1 TO LEN(kk$)
       'random pick (1 to 36) plus the code of the character already stored here
       MID$(kk$, a, 1) = MID$(aa$, (RND(1, LEN(aa$)) + ASC(kk$, a)) MOD LEN(aa$) + 1, 1)
    NEXT
[/code]

Now that nasty-looking line above is doing all sorts of interesting things in one operation. Working from the inside out, it is using RND() to pick the next random value, which will be returned as a value between 1 and LEN(aa$) inclusive. Then it adds the existing character value at the current location in kk$ as determined by the variable a. For most of the string, this will be the value 32, which is the code for a space. But when it hits the stored "L" at the third location, the value recovered will be 76. And when it gets to position 7, the returned value will be 69, and so on as it reaches the next two letters or encounters spaces in the process. Now what it does with this value is add it to the RND() value, then perform a MOD instruction against the LEN() of aa$, which holds all the characters we are allowing for inclusion in our registration key, then add 1 so that it becomes a pointer into aa$ for the character we want to place in kk$ at this current location.
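To make the arithmetic concrete, here is a small Python sketch of that one line with the random draw passed in by hand. `mix_char` is a hypothetical helper name, and because Python strings are indexed from 0 the "+ 1" from the BASIC version drops out:

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def mix_char(draw, placeholder):
    """draw: random value in 1..36; placeholder: character already in the key slot."""
    index = (draw + ord(placeholder)) % len(ALPHABET)  # 0-based index into ALPHABET
    return ALPHABET[index]

# The same draw lands on a different character when a hidden letter sits in the slot.
print(mix_char(10, " "))  # G   (10 + 32 = 42, 42 MOD 36 = 6)
print(mix_char(10, "L"))  # O   (10 + 76 = 86, 86 MOD 36 = 14)
```

A verifier that regenerates the key with the wrong hidden word would get different characters at those four positions, so the comparison fails.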

Well, that got us our "REAL" into the registration key, where it is completely hidden from being detected, but we are still left with the problem of getting our seed value in there as well.

Inserting the seed has to effectively be done last, because it has to be recovered first. One way to do this is to set aside some character spaces where we want to place it, and the number of places to set aside depends on how large we want the seed value to be. If it is only going to be from 0 to 35 (or 1 to 36), we can represent it with one character space. If it is going to be from 0 to 1295 (or 1 to 1296), we can use two character spaces. With three character spaces, we can represent 46,656 seed values (0 to 46,655). With four character spaces, we can go up to over 1.6 million seed values.
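Those capacities are just powers of 36, which a short Python sketch can confirm (`seed_capacity` is a name chosen here for illustration):

```python
ALPHABET_SIZE = 36  # A to Z plus 0 to 9

def seed_capacity(places):
    """Number of distinct seed values that fit in `places` key characters."""
    return ALPHABET_SIZE ** places

for places in range(1, 5):
    total = seed_capacity(places)
    print(places, "place(s):", total, "values, 0 to", total - 1)
```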

But we don't need a lot of seed values, because we have something else going for us, which is that we can obscure the use of our random process as well. For instance, every time I call RND(), I use up the next sequential random value that is triggered off that seed value. So how many times do I call RND() as I generate my sequence? Usually just once, but I could call it two, or three, or even more times if I chose. Or even less. Anything other than precisely once per new character will befuddle anyone trying to determine my sequence as a way of reverse engineering to find my original seed value.

How do I manage to call less than once? I can arbitrarily elect to call two characters up from the available character list at once if I choose. I could then have "DE" as a selected pair, or "Z0", or "UV". It's a bit trickier to select properly, but would give you another validation point: the characters in kk$ at the end of the process must be consecutive by pairs. Only you don't want it to be that obvious. So you start off by scrambling the order of the characters in aa$ so that there are no consecutive pairs either alphabetically or numerically. And you can do this again with the RND() process:

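[code]
    aa$ = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    RANDOMIZE seedvalue          'any value you choose
    FOR a = 1 TO LEN(aa$)
      b = RND(1, LEN(aa$))       'pick a random partner position
      bb$ = MID$(aa$, a, 1)      'swap the characters at positions a and b
      MID$(aa$, a, 1) = MID$(aa$, b, 1)
      MID$(aa$, b, 1) = bb$
    NEXT
[/code]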
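The same swap loop can be sketched in Python to show that the scramble is repeatable and keeps every character exactly once. This uses Python's `random` module, so the resulting order will differ from what PowerBASIC produces for the same seed, and `scramble_alphabet` is a hypothetical name:

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def scramble_alphabet(seed, alphabet=ALPHABET):
    """Swap each position with a randomly chosen partner, driven only by `seed`."""
    rng = random.Random(seed)
    chars = list(alphabet)
    for i in range(len(chars)):
        j = rng.randrange(len(chars))            # partner position, 0-based
        chars[i], chars[j] = chars[j], chars[i]  # swap, as in the BASIC loop
    return "".join(chars)

print(scramble_alphabet(2008) == scramble_alphabet(2008))  # True: repeatable
```

One caveat: a scramble like this makes neighbouring pairs look random, but it does not strictly guarantee that no two originally consecutive characters end up side by side.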

Now you can have consecutive pairs out of aa$ that look random, but which can actually help confirm that the key is valid. You would just skip over the even numbered positions in kk$ when filling it in from aa$ by adding a STEP 2 to the FOR statement. However, you might want to recover a WORD value from kk$ instead of using ASC() which is a byte value, so that the "REAL" characters do not all have to be placed on odd-position boundaries in order to work.

Alright, now how do we embed the seed value? Since we have up to 36 characters that we can encode with (A-Z, 0-9), let's say that we choose to use three characters to represent our seed value. That gives us 36^3 values that we can represent, from 0 to 46,655. Nobody knows how many seeds we allowed for, so they have to try different combinations to even begin to emulate what we are doing here. And as we encode each one, we can place it at some arbitrary position in our registration key. But since we are doing this, we actually want to change our previous code to only generate 22 characters as described above. So to begin with, kk$ should be set equal to SPACE$(22).

Having generated most of our registration key now, we want to arbitrarily insert our seed value as well:

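[code]
    sv = seedvalue
    bb$ = MID$(aa$, sv MOD LEN(aa$) + 1, 1)      'lowest base-36 digit of the seed
    kk$ = LEFT$(kk$, 5) + bb$ + MID$(kk$, 6)     'insert it as the 6th character
    sv = sv \ LEN(aa$)
    bb$ = MID$(aa$, sv MOD LEN(aa$) + 1, 1)      'middle digit
    kk$ = LEFT$(kk$, 20) + bb$ + MID$(kk$, 21)   'insert it as the 21st character
    sv = sv \ LEN(aa$)
    bb$ = MID$(aa$, sv MOD LEN(aa$) + 1, 1)      'highest digit
    kk$ = LEFT$(kk$, 15) + bb$ + MID$(kk$, 16)   'insert it as the 16th character
[/code]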
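Putting the pieces so far together, the generate-then-verify idea can be sketched end to end. This is a Python illustration under several assumptions, not the generator the posts describe: it uses Python's `random` module instead of RND(), an unscrambled alphabet, the "REAL" positions and seed insertion points from the snippets above, and the hypothetical names `generate_key` and `verify_key`.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
HIDDEN = {12: "R", 7: "E", 19: "A", 3: "L"}  # 1-based positions, as in the post
SEED_SLOTS = (5, 20, 15)                     # insertion points, in insertion order
BODY_LEN = 22

def generate_key(seed):
    """Build a 25-character key from a seed in the range 0 .. 36**3 - 1."""
    rng = random.Random(seed)
    body = [" "] * BODY_LEN
    for pos, ch in HIDDEN.items():
        body[pos - 1] = ch
    for i in range(BODY_LEN):
        draw = rng.randint(1, len(ALPHABET))
        body[i] = ALPHABET[(draw + ord(body[i])) % len(ALPHABET)]
    key = "".join(body)
    sv = seed
    for slot in SEED_SLOTS:                  # lowest base-36 digit goes in first
        key = key[:slot] + ALPHABET[sv % len(ALPHABET)] + key[slot:]
        sv //= len(ALPHABET)
    return key

def verify_key(key):
    """Recover the seed, regenerate the key, and compare with what was entered."""
    if len(key) != BODY_LEN + len(SEED_SLOTS) or any(c not in ALPHABET for c in key):
        return False
    rest, digits = key, []
    for slot in reversed(SEED_SLOTS):        # undo the insertions, last one first
        digits.append(ALPHABET.index(rest[slot]))
        rest = rest[:slot] + rest[slot + 1:]
    seed = 0
    for d in digits:                         # digits come out most significant first
        seed = seed * len(ALPHABET) + d
    return generate_key(seed) == key

key = generate_key(4242)
print(key, verify_key(key))
```

Note that the verifier never decodes the body of the key. It only pulls out the three seed characters and checks that regeneration gives back the same 25 characters.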

One of the additional things we can do is exclude certain characters from aa$ if we choose to. For instance, we could exclude 0 (zero), 8 (eight), Q (the letter after P), and V (the letter after U) because they can be hard to distinguish when not entered as part of a recognizable word. That would reduce aa$ to 32 characters, which is also a power of two, and can facilitate manipulations with ASM code and bitwise operators. It also reduces the number of input errors by users, and further excludes random sequences from possibly matching.
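With 32 characters, each key character carries exactly 5 bits, so values can be packed and unpacked with shifts and masks instead of division. A small Python sketch, where `to_chars` and `from_chars` are hypothetical helper names:

```python
FULL = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
ALPHABET32 = "".join(c for c in FULL if c not in "08QV")  # 32 characters remain

def to_chars(value, width):
    """Encode `value` as `width` characters, 5 bits per character, low bits first."""
    out = []
    for _ in range(width):
        out.append(ALPHABET32[value & 31])  # mask off the low 5 bits
        value >>= 5
    return "".join(out)

def from_chars(text):
    """Inverse of to_chars."""
    value = 0
    for i, ch in enumerate(text):
        value |= ALPHABET32.index(ch) << (5 * i)
    return value

print(len(ALPHABET32))                    # 32
print(from_chars(to_chars(12345, 3)))     # 12345
```

At 5 bits per character, a 25-character key has room for 125 bits in total.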

If we were actually encoding and decoding the content of files, where the data must be restored to original form after all manipulation, then the first rule above comes back into play, and we have to have a decode process that mirrors the encode process exactly. But we can also then employ the full 256-character set for our aa$, and also employ bitwise operators like XOR and ROTATE operations. These work very well for this purpose because they do not destroy content: with XOR you are merely reversing 1s and 0s, and with ROTATE, you are just displacing the bits by some factor. You can also work at the BYTE, WORD, or DWORD level against string data, which can make it harder to determine what you did and how you did it, and you can employ any processing strategy you want: beginning to end, end to beginning, from either end alternately, or from the middle out. This is also a good place to employ RND(), or possibly use an irrational number, digit by digit, to decide what character position to process next.
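As a quick illustration of why XOR and rotation lose nothing, here is a Python sketch at the byte level. The rotation count of 3 and the function names are arbitrary choices made for this example, and this kind of scrambling is obfuscation rather than real encryption:

```python
from itertools import cycle

def rotl8(b, n):
    """Rotate an 8-bit value left by n bits."""
    n %= 8
    return ((b << n) | (b >> (8 - n))) & 0xFF

def rotr8(b, n):
    """Rotate an 8-bit value right by n bits (undoes rotl8)."""
    n %= 8
    return ((b >> n) | (b << (8 - n))) & 0xFF

def encode(data, mask):
    # XOR each byte with the repeating mask, then rotate left by 3.
    return bytes(rotl8(b ^ m, 3) for b, m in zip(data, cycle(mask)))

def decode(data, mask):
    # Mirror image: rotate right by 3 first, then XOR with the same mask.
    return bytes(rotr8(b, 3) ^ m for b, m in zip(data, cycle(mask)))

scrambled = encode(b"Hello, world", b"key")
print(decode(scrambled, b"key"))  # b'Hello, world'
```

The decode steps are the encode steps in reverse order, which is the driving-directions rule from earlier in the thread.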

Any questions?

Donald Darden · May 17, 2008, 06:23:08 AM

I figure I ought to come back and add some notes before someone checks out my code above and voices some complaint.

First, the code above would just string 25 characters together. The hyphen (-) separators would only appear when you format the output to print to screen or file. The hyphens just make the code more readable, but would get in the way during processing. When you accept someone's input of their presumed registration key, you can strip out superfluous characters with a command like RETAIN$() or REMOVE$(), for instance:

[code]
    vv$ = RETAIN$(UCASE$(xx$), ANY "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")
[/code]

Note also that the UCASE$() will force lower case characters to upper case, so you don't have any case sensitivity issues to deal with either.
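For comparison, the same clean-up and the reverse step of adding hyphens for display can be sketched in Python. `normalize_key` and `format_key` are hypothetical names:

```python
VALID = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")

def normalize_key(text):
    """Upper-case the input and keep only letters and digits."""
    return "".join(c.upper() for c in text if c.upper() in VALID)

def format_key(key, group=5):
    """Insert a hyphen between groups of `group` characters, for display only."""
    return "-".join(key[i:i + group] for i in range(0, len(key), group))

print(normalize_key("abcde - 01234"))  # ABCDE01234
print(format_key("ABCDE01234"))        # ABCDE-01234
```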

Now let's look at something that Microsoft does with its registration keys.

This is an actual registration key recovered with a search on the Internet: DDQXW-THQ8M-79V6K-2YFGH-R739Q. I just won't name the vendor and product that it works with. If the vendor is smart, he will have found this key, just as I did, and will now have blocked it from further activation or updates.

Now there is something interesting about this key. First, if we had a block of these registration codes for the same product, we might detect that certain characters show a high repeat rate within each of the separate groups. Note the DD in the first group. You might also see some characters appear more frequently in several groups, such as the Q appearing three times and the H, 7, and 9 each appearing twice above. This could be coincidence, or it could indicate a subtle pattern. These patterns can become more evident as you gather more and more keys to compare.
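A tally like that is easy to automate. The following Python sketch counts the characters of the key quoted above. A single key proves nothing either way, so this only shows the mechanics of looking for repeats:

```python
from collections import Counter

key = "DDQXW-THQ8M-79V6K-2YFGH-R739Q"
counts = Counter(key.replace("-", ""))

# Characters that occur more than once, most frequent first.
repeats = [(ch, n) for ch, n in counts.most_common() if n > 1]
print(repeats)
print(len(counts), "distinct characters out of", sum(counts.values()))
```

Run over a large batch of keys for one product, the same tally would show whether the generator draws from fewer than 36 characters.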

Let's see what might produce such a pattern. We have 36 characters that we can use for our registration process. But suppose we decide we want to limit the number even more for a specific product. We could choose just 10 characters out of the 36 to represent the digits in a decimal number, or 16 characters out of the 36 if we wanted to represent a hexadecimal number.

Now suppose we decide we want to encode a 25 character key, but are only going to create 22 characters using a random process, and use 3 characters to represent the seed value. Sounds familiar, right? But this approach is radically different in certain ways.

This time we use a DOUBLE to work with RND, and string together just the returned digits by first using STR$(), then RETAIN$() to remove everything else. When we have enough digits, we process them into cc$.

From cc$, we process them bitwise into kk$, 4 bits (0 - 15) at a time. Here is the whole program:

[code]
GLOBAL a, b, seedvalue AS LONG
GLOBAL aa, bb, cc, kk AS STRING
GLOBAL z AS DOUBLE

FUNCTION PBMAIN
  COLOR 15,1
  CLS
  aa$ = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
  'scramble the character list (runs before RANDOMIZE, so it uses the default sequence)
  FOR a = 1 TO LEN(aa$)
    b = RND(1, LEN(aa$))
    bb$ = MID$(aa$, a, 1)
    MID$(aa$, a, 1) = MID$(aa$, b, 1)
    MID$(aa$, b, 1) = bb$
  NEXT
  PRINT "codebase = "; LEFT$(aa$, 16)
  'find your own method here; ABS keeps the seed from going negative,
  'which would break the MID$ lookups further down
  seedvalue = ABS(SIN(12.52) * 36123) MOD 16^3
  RANDOMIZE seedvalue
  kk$ = SPACE$(22)
  cc$ = ""
  DO
    bb$ = ""
    DO
      z = RND                'not an integer returned this time
      bb$ = bb$ + RETAIN$(STR$(z), ANY "0123456789")
    LOOP WHILE LEN(bb$) < 11
    cc$ = cc$ + HEX$(VAL(bb$))
  LOOP WHILE LEN(cc$) < 11
  'split each character code of cc$ into two 4-bit values and look both up in aa$
  FOR a = 1 TO 11
    b = ASC(cc$, a)
    MID$(kk$, a + a - 1) = MID$(aa$, b MOD 16 + 1, 1) + MID$(aa$, b \ 16 + 1, 1)
  NEXT
  'insert the three seed characters, 4 bits each
  a = seedvalue
  kk$ = LEFT$(kk$, 15) + MID$(aa$, a MOD 16 + 1, 1) + MID$(kk$, 16)
  a = a \ 16
  kk$ = LEFT$(kk$, 6) + MID$(aa$, a MOD 16 + 1, 1) + MID$(kk$, 7)
  a = a \ 16
  kk$ = LEFT$(kk$, 21) + MID$(aa$, a MOD 16 + 1, 1) + MID$(kk$, 22)
  'print the key in groups of five
  PRINT "validkey = "; $DQ;
  FOR a = 1 TO LEN(kk$) STEP 5
    PRINT MID$(kk$, a, 5);
    IF a + 5 < LEN(kk$) THEN PRINT " - ";
  NEXT
  PRINT $DQ
  WAITKEY$
END FUNCTION
[/code]
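One side effect of the program above is worth seeing in numbers: cc$ holds hexadecimal digits, and the loop splits the character code of each digit, not its value. A short Python sketch of that split (`split_code` is a hypothetical name) shows how few distinct lookups result:

```python
HEX_DIGITS = "0123456789ABCDEF"

def split_code(ch):
    """Return (code MOD 16, code \\ 16) for one character, as the BASIC loop does."""
    code = ord(ch)
    return code % 16, code // 16

lows = set()
highs = set()
for ch in HEX_DIGITS:
    low, high = split_code(ch)
    lows.add(low)
    highs.add(high)

print(sorted(lows))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(sorted(highs))  # [3, 4]
```

So the second character of every pair is always the 4th or 5th character of the codebase, and the first comes from only the first ten. That is one way a key can end up with the kind of repeat pattern described earlier.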

Now since we decided to resort to bit manipulations, we can add some rotate and XOR operations if we want, but it really makes no difference, as the source of the individual characters that appear in the final registration key has been totally obscured by our bizarre methods.

And this is the thing that any would-be code breaker has to contend with, because even this relatively simple method is impossible to figure out, as it does not appear to have any logical progression from one character to the next, and our manipulations defeat any attempt to use natural tendencies of RND to accidentally chance on the same outcome.

Going back to the matter of supposed KeyGen solutions to Microsoft and other registered products that are presumed to be in the hands of hackers and software pirates, even to the point of producing non-authentic copies of Windows and Office, I think you will find that either the Microsoft key method is simpler than it appears (or should be), or they have been victimized by insiders that have stolen lists of valid codes or managed to get their hands on the algorithm that Microsoft adopted for its process. Because, like I said elsewhere, these keys will not yield to a brute force approach, particularly if you somehow tie the seed value to the generated reference code in yet another way.

But that's a different story, and I don't want to give too much away.

Donald Darden · May 18, 2008, 08:42:05 PM

What's wrong with jaywalking or spitting-on-the-sidewalk laws? Nothing, in and of themselves, but laws that are ignored or not enforced are almost worse than no laws at all. They make a mockery of the law in general.

Somewhat the same thing with the Registration Key concept. What is the point of having a registration requirement if it counts for nothing in the long run? So how do you set up a registration process that has some bite to it?

First, you have to presume that a failure to register is often intentional. People don't want to be bugged by needless requirements, and if they don't register and can still use the product, then that is what they will likely do. So either you have to offer inducements to get them to register willingly, or limit what the product can do, or how long it will work if they don't. But in no case can you do anything as extreme as crash their system, wipe out their hard drive, or do anything else of a destructive nature here. There is no justification for doing something like that, and you will only anger the customer base and likely incur legal repercussions as well.

Some of the inducements you can offer are: software support, software updates, membership in a community of users, documentation, example programs, and discount on future upgrades.

Some of the limitations you can impose are: limited time use, demo only mode (cannot create exe file for instance), embedded ads, registration reminders, tagged executables (nags that become part of the finished product), inability to mesh with other products (no DLL support, for instance).

Now the registration key generators discussed earlier represent the idea of creating a key that can be authenticated based simply on the way it is composed. This is very useful in making sure that the key is valid, and since you have to supply the key to the user, you can build your sale up around the steps leading up to issuing the key. This is a common approach. It also works well with managing software downloads and sales online.

A different approach, of providing the registration key at the time of sale, is used by Microsoft and others that typically sell packages over the counter. The key is provided, but the problem here is ensuring that it is used no more often than allowed by the terms of the sale. Actually, at this point we could have a problem with limiting the extended use of any registration key, regardless of how it ends up in someone's hand.

Now keep in mind that computers are problematic in nature, and often, some or all the software will need to be reinstalled. Further, computers are often replaced after a time, and the sale was not conditional on it being used solely on any one PC, so we have to allow for some reinstalls as a matter of course. Further, customers are likely to contact you at some point in the future to get their registration key again, because they have lost it. So you need to keep a record of who has what key that you can refer to after the sale.

What this leads us to is the recognition that we have to track sales and usage to some degree in order to serve our customer base. That means a database will be involved, and if you are smart, it means online registration by the customer so that information about the sale goes right into the database.

But what if the customer does not have an internet connection, or is not currently connected? This is increasingly rare, but it could happen, and some PCs could be on a company network with restrictions on outbound connections. What Microsoft does is allow you 30 days grace before you have to contact them to activate the product. That might work, or you might need a second key that you can supply to the buyer if they contact you by phone instead.

So you might have to consider a temporary key and a permanent key as enablers for your product.

Now as a practical matter, where should the registration key go, and how can we verify how long a product has run?

This is where the Windows Registry sometimes pays off. There is so much in the Registry, and it is so difficult to tell what all of it does or is used for, that you can easily hide new information there. But someone could technically spot any changes by saving the Registry beforehand, then running a delta against the Registry afterwards. Further, some protective software out there inhibits Registry changes unless the user agrees, and some users panic when forced to decide if they should permit a change or not, so that can be a problem.

However, whether you decide to write information to the registry, or whether you write it to a file, the approach being described here can be implemented either way.

(1) The program installer asks for the registration key. If valid, it writes this to an entry in the Registry or to a file stored in a place not related to the product. (If stored in a product-related place, it could easily be copied to another PC without a valid install & registration process). The stored registration key is adapted as it is stored, to prevent it being used for a different install by simply copying it from the registry or from the file.

(2) The installer also makes a mark somewhere to indicate that it has written out this info, so that it knows it has run, and the program knows that it has run. The installer then may call home to register if it is able to, or tell the user how to complete the registration process on his own.

(3) The program can then start, check to see if the installer has previously run, then check to see if the registration key is where it is supposed to be. If not, it runs in demo or limited mode. This may also involve checking to see how long the program has been in use, or how many times it has been started.
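One possible way to "adapt" the stored key in step (1), so that copying it to another PC does not help, is to store it alongside a hash that also covers something specific to the machine. The sketch below is in Python and is only an assumption about how this could be done: the posts do not say how the key is adapted, `machine_id` stands in for whatever machine-specific value you can obtain, and both function names are hypothetical.

```python
import hashlib

def stored_token(registration_key, machine_id):
    """Hash the key together with a machine-specific value (hypothetical input)."""
    data = (machine_id + "|" + registration_key).encode("utf-8")
    return hashlib.sha256(data).hexdigest()

def install_is_genuine(saved_key, saved_token, machine_id):
    """At startup: does the saved token match this key on this machine?"""
    return saved_token == stored_token(saved_key, machine_id)

token = stored_token("ABCDE01234", "PC-ONE")
print(install_is_genuine("ABCDE01234", token, "PC-ONE"))  # True
print(install_is_genuine("ABCDE01234", token, "PC-TWO"))  # False: copied to another PC
```

Anyone who works out the scheme can recompute the token, so this only raises the bar against casual copying.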

Elapsed-time checks within the program can be tricky, because a knowledgeable user is going to suspect that you will either use the folder or program create time for counting from, or use a Registry entry to record the date when started. He might also consider that you have written it to a file in the program folder. In an effort to beat you, the user may try to re-timestamp the folders and files to see if he can get more time out of the time-limited version. I've even seen posts where people have reset the PC date to keep some software going past its expiration date. So if you want to manage time-limited software, you might want to see if you can validate the date via the internet rather than accept blindly what the DATE function returns.

On the server end, you probably need some personal information about the user, typically the company and person's name. But if the company owns it, maybe just the company name, or if the person is buying it for himself, then exclude the company name. Call it an effort to identify the responsible party, so that if the Registration code gets into the wild, you know who you are dealing with. More important, perhaps, the responsible party knows that they have been name-identified with that code. That might mean greater care in administering it. An address and phone number would help here as well, and an email address.

You might also want a re-registration counter field. If it exceeds some arbitrary value, you might want an alert that it may be subject to abuse. You may also want to disable the software at that point, or put it back to limited mode until the problem is rectified or the matter explained.
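The counter idea amounts to very little code on the server side. A Python sketch, with a plain dictionary standing in for the database and a limit of 3 picked arbitrarily for the example (`record_reregistration` is a hypothetical name):

```python
def record_reregistration(counts, key, limit=3):
    """Count one more registration of `key`; flag it once the limit is exceeded."""
    counts[key] = counts.get(key, 0) + 1
    return "ok" if counts[key] <= limit else "review"

counts = {}
results = [record_reregistration(counts, "ABCDE01234") for _ in range(5)]
print(results)  # ['ok', 'ok', 'ok', 'review', 'review']
```

What happens on "review" (an alert, limited mode, or a phone call) is a policy decision rather than a coding one.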

You are probably going to use the registration key as a means of crossing to the registered user and product, possibly the version number, and you may want to track updates as well. If you track updates, you can send email notices to people that have not updated to let them know when they are available.

I guess that is all I have to say for the moment. Anybody else got some ideas to share in this regard?

Marco Pontello · May 19, 2008, 09:10:43 PM

Just a note.
If you need pseudo-random numbers for something like this, I think that using the proprietary function provided by a particular language implementation isn't the right thing to do: it may lead to big trouble if, for example, something changes in a later version of that language, or a need comes up to use a different language or system.

Bye!

Donald Darden · May 21, 2008, 08:03:26 AM

I can see where that might be a problem down the road. There are some hardware implementations of the pseudo-random number generator (which is the correct way of referring to a random number generator, commonly abbreviated RNG, but more correctly PRNG). There are also system implementations, some programming languages use their own, and you can always do some research and adopt one of your own choosing.

If you are going to use one that you choose, you don't have to worry about which language or system you are relying on in the future. And you can judge the merits of your chosen method and go with whichever one seems to be the most promising. For instance, some PRNGs appear slightly more random than others, and some offer a much larger return value and longer period before they begin to repeat. Speed of execution is not a factor because you probably only occasionally need to generate a new random key.
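As an example of a generator you carry with you, here is a linear congruential generator sketched in Python. The multiplier and increment are the widely published "Numerical Recipes" constants for a 32-bit LCG. The class and method names are made up for this sketch, and a generator like this is fine for reproducible key material but is not cryptographically secure.

```python
class Lcg:
    """32-bit linear congruential generator: state = (A * state + C) mod 2**32."""
    A = 1664525
    C = 1013904223
    MASK = 0xFFFFFFFF

    def __init__(self, seed):
        self.state = seed & self.MASK

    def next_u32(self):
        self.state = (self.A * self.state + self.C) & self.MASK
        return self.state

    def randint(self, lo, hi):
        """Integer in lo..hi inclusive, taken from the stronger upper 16 bits."""
        return lo + (self.next_u32() >> 16) % (hi - lo + 1)

rng = Lcg(1)
print(rng.next_u32())  # 1015568748
print([Lcg(2008).randint(1, 36) for _ in range(3)] ==
      [Lcg(2008).randint(1, 36) for _ in range(3)])  # True: same seed, same value
```

Because the arithmetic is spelled out, the same few lines can be ported to any language and will give the same sequence for the same seed.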

So that is just something else to take into consideration.
