Try using base31 - seriously, I think it's a better and easier alternative. It increments by default - just hand it the row's ID
½.ws/ac
Just ran a test - the 1000th URL only needs 3 characters.
The 10000th still stays within 3 characters.
Well, so far, with the tests I've run, it does increment from a- to ba, and ba to bb, so I should be set for the next 4000 URLs.
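The incrementing scheme above can be sketched like this - a minimal base-31 encoder for row IDs. The alphabet here is an assumption (digits 0-9 plus the first 21 lowercase letters); the chat never pins down the exact charset:

```python
# Sketch of encoding an auto-increment row ID in base 31.
# The alphabet is an assumption - the exact charset isn't given above.
ALPHABET = "0123456789abcdefghijklmnopqrstu"  # 31 characters
BASE = len(ALPHABET)

def encode(row_id: int) -> str:
    """Convert a non-negative integer ID to its base-31 string."""
    if row_id == 0:
        return ALPHABET[0]
    digits = []
    while row_id:
        row_id, rem = divmod(row_id, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

# 31^3 = 29,791, so the 1,000th and 10,000th IDs both fit in 3 characters:
print(len(encode(1000)))   # 3
print(len(encode(10000)))  # 3
```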
That takes four characters - but if you use base36 you can have 47,000,000 and use 3 characters
I can convert that paste to base36 if you want.
I'm sorry, it's 5 characters at 47 million
47 million is how many TinyURL has - and they're using 6 characters right now
This would be a competitive algorithm
I'm just kinda confused as to how this would work. Is there a Wikipedia article or some such I could look at?
You know how we normally count with the digits 0 through 9? That's base 10. Base 36 just has more digits: 0,1,2,...,9,a,b,c,d,e,f,g... all the way to z.
It's a different way of counting, basically - and by converting the base 10 row IDs to base 36, we can save a TON of characters.
47,000,000 converts to a five-character base36 number - 'rzdfk'.
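That conversion is easy to sanity-check - a minimal sketch of a base-36 encoder (Python's built-in `int(s, 36)` parses the other direction, using digits 0-9 then a-z):

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base36(n: int) -> str:
    """Convert a non-negative integer to its base-36 representation."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 36)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))

s = to_base36(47_000_000)
print(s, len(s))   # rzdfk 5
print(int(s, 36))  # 47000000  (round-trips via the built-in parser)
```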
Also, this way there are no limits - before, if you got to ---_, it wouldn't know what to do.
But would the SQL understand the base 36?
Still - this is a much more efficient algorithm.
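On the SQL question: the database never has to understand base 36 - the row keeps its ordinary integer primary key, and the conversion happens only in application code. A minimal sketch using sqlite3 (the `urls` table and its column names are hypothetical):

```python
import sqlite3

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base36(n: int) -> str:
    """Convert a non-negative integer to its base-36 representation."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 36)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))

# Hypothetical schema: the database only ever sees the integer ID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (id INTEGER PRIMARY KEY, target TEXT)")
cur = conn.execute("INSERT INTO urls (target) VALUES (?)",
                   ("http://example.com",))
row_id = cur.lastrowid          # plain base-10 integer

short_code = to_base36(row_id)  # base 36 exists only in app code

# Resolving a short code: decode back to an int, then query as usual.
looked_up = conn.execute("SELECT target FROM urls WHERE id = ?",
                         (int(short_code, 36),)).fetchone()[0]
print(looked_up)  # http://example.com
```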
I have to go eat dinner - if you have any more questions, I'd be glad to help once I get back.
If I was going to do Encoding, why wouldn't I do something like
Base 62 ?
Sure, that works, too. Let me run a few tests with 62..
With Base62, after 47,000,000 links it has 5 digits
At 10,000,000 it has 4 characters
Are you sure this is really going to help me? I get up to 17850625 with 4 chars.
1mil: 4, 750k: 4, 250k: 4, 100k: 3, 1k: 2, 50: 1
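Those base-62 lengths are easy to reproduce - a quick sketch (the digit ordering in the alphabet is an assumption; any ordering of 0-9, a-z, A-Z gives the same lengths):

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base62(n: int) -> str:
    """Convert a non-negative integer to its base-62 representation."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))

# 62^3 = 238,328 is the 3-character ceiling, hence the jump at 250k:
for n in (1_000_000, 750_000, 250_000, 100_000, 1_000, 50):
    print(n, len(to_base62(n)))
# prints: 1000000 4 / 750000 4 / 250000 4 / 100000 3 / 1000 2 / 50 1
```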
Yeah, I get up to 274625 with 3 chars.
With a modified algorithm (base 66) I can get 4 characters with 17850625. This is a much easier way to do it - less code and less CPU.
With Base66, I can get up to about 19 million (66^4 = 18,974,736) with 4 characters.
(Base66 is Base64 with these extras: "-_()")
Problem is, I'm not sure if I can put those in a URL.
Aren't I basically doing base 65 anyway?
Yes, this is just a cleaner and simpler way - actually converting, without loops and arrays. I think that if this site is going to receive heavy traffic, it's a more viable option.
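On the URL worry: "-" and "_" are unreserved characters in RFC 3986, and "(" and ")" are listed among its sub-delims, so all four should be legal in a URL path. A generic encoder makes the base-65 vs base-66 comparison concrete - the exact 66-character alphabet below is an assumption (URL-safe Base64's characters plus the parentheses):

```python
B66_ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                "abcdefghijklmnopqrstuvwxyz"
                "0123456789-_()")  # 66 chars: assumed alphabet

def encode(n: int, alphabet: str) -> str:
    """Encode a non-negative integer using an arbitrary digit alphabet."""
    base = len(alphabet)
    if n == 0:
        return alphabet[0]
    out = []
    while n:
        n, rem = divmod(n, base)
        out.append(alphabet[rem])
    return "".join(reversed(out))

# 65^4 - 1 = 17,850,624 and 66^4 - 1 = 18,974,735 are the 4-char ceilings:
print(len(encode(17_850_624, B66_ALPHABET[:65])))  # 4
print(len(encode(18_974_735, B66_ALPHABET)))       # 4
```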
So, how easy would it be to make it compatible with the URLs that are already encoded?
I think you'd have to reset the database - those URLs aren't encoded according to Base65/64 rules, so they wouldn't work.