l_l
17 years ago
Ok, ½.WS is working as it should, but I'm not quite sure why. For some reason it looks like $num_char is not being incremented right
Hans
17 years ago
Try using base31 - seriously, I think it's a better and easier alternative. Increments by default - just hand it the row's ID ½.ws/ac
Hans
17 years ago
Just ran a test - the 1000th URL only needs 3 characters.
Hans
17 years ago
The 10000th still stays within 3 characters.
l_l
17 years ago
Well, so far, with the tests I've run, it does increment from a- to ba, and ba to bb, so I should be set for the next 4000 URLs.
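For readers following along, the increment described here (a- to ba, ba to bb) behaves like an odometer over a fixed alphabet. The sketch below is only an illustration: the thread never shows the real code or alphabet, so `ALPHABET` and `next_code` are hypothetical names, and the symbol order is assumed so that "a" comes first and "-" last.

```python
import string

# Hypothetical alphabet: the thread never lists the real one. It is ordered
# so that "a" is the first symbol and "-" the last, matching "a-" -> "ba".
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + "_-"


def next_code(code: str) -> str:
    """Return the code that follows `code`, carrying like an odometer."""
    chars = list(code)
    i = len(chars) - 1
    while i >= 0:
        pos = ALPHABET.index(chars[i])
        if pos + 1 < len(ALPHABET):
            # No wrap needed at this position: bump it and stop.
            chars[i] = ALPHABET[pos + 1]
            return "".join(chars)
        # Last symbol reached: wrap this position and carry to the left.
        chars[i] = ALPHABET[0]
        i -= 1
    # Every position wrapped, so the code grows by one character.
    return ALPHABET[1] + "".join(chars)


print(next_code("a-"))
print(next_code("ba"))
```

The carry loop is the part that a plain base conversion of the row ID makes unnecessary.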
l_l
17 years ago
What about 100000?
Hans
17 years ago
That takes four characters - but if you use base36 you can have 47,000,000 and use 3 characters
Hans
17 years ago
I can convert that paste to base36 if you want.
l_l
17 years ago
How would that work?
Hans
17 years ago
I'm sorry, it's 5 characters at 47 million
Hans
17 years ago
47 million is how many TinyURL has - and they're using 6 characters right now
Hans
17 years ago
This would be a competitive algorithm
l_l
17 years ago
I'm just kinda confused as to how this would work. Is there a Wikipedia article or some such I could look at?
Hans
17 years ago
You know how we normally count with the digits 0 through 9? That's base 10. Base 36 is 0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f,g..
Hans
17 years ago
It's a different way of counting, basically - and by converting the base 10 row IDs to base 36, we can save a TON of characters.
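A minimal sketch of the conversion being described, assuming the conventional base-36 digit order (0-9, then a-z). The paste mentioned in the thread is not shown, so `DIGITS` and `encode` are illustrative names, not the actual implementation.

```python
import string

# Conventional base-36 digits: 0-9 then a-z. This ordering is an assumption;
# the paste discussed in the thread is not shown.
DIGITS = string.digits + string.ascii_lowercase


def encode(n: int, alphabet: str = DIGITS) -> str:
    """Write the non-negative integer n using the symbols in `alphabet`."""
    if n < 0:
        raise ValueError("row IDs are expected to be non-negative")
    if n == 0:
        return alphabet[0]
    base = len(alphabet)
    out = []
    while n:
        # Peel off the least significant digit each time round.
        n, rem = divmod(n, base)
        out.append(alphabet[rem])
    return "".join(reversed(out))


code = encode(1000)
print(code)
# Python's int() parses base-36 strings, which gives a quick round-trip check.
print(int(code, 36))
```

Passing a longer alphabet (for example one with 62 symbols) switches the base without touching the loop.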
Hans
17 years ago
Like I showed before, 47,000,000 converts to a five-character base36 number - 'qycek'.
Hans
17 years ago
Also, this way there are no limits - before, if you got to ---_, it wouldn't know what to do.
l_l
17 years ago
Naw, I fixed that.
l_l
17 years ago
But would the SQL understand the base 36?
Hans
17 years ago
Still - this is a much more efficient algorithm.
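On the SQL question above: one common arrangement is that the database never sees the base-36 text at all. The table keeps an ordinary integer ID, and the application decodes the short code back to that integer before querying. The sketch below assumes that arrangement; `sqlite3` stands in for whatever database the site uses, and the `links` table, `decode`, and `lookup` are hypothetical.

```python
import sqlite3
import string

DIGITS = string.digits + string.ascii_lowercase  # assumed base-36 alphabet


def decode(code: str, alphabet: str = DIGITS) -> int:
    """Turn a short code back into the integer row ID it encodes."""
    base = len(alphabet)
    n = 0
    for ch in code:
        # alphabet.index raises ValueError for characters outside the alphabet.
        n = n * base + alphabet.index(ch)
    return n


# Hypothetical schema: an integer primary key plus the long URL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")
conn.execute(
    "INSERT INTO links (id, url) VALUES (?, ?)",
    (1000, "http://example.com/long/path"),
)


def lookup(code: str):
    """Return the long URL for a short code, or None if no row matches."""
    row = conn.execute(
        "SELECT url FROM links WHERE id = ?", (decode(code),)
    ).fetchone()
    return row[0] if row else None


print(lookup("rs"))
```

Because the query only ever compares integers, the choice of base stays an application detail.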
Hans
17 years ago
I have to go eat dinner - if you have any more questions, I'd be glad to help once I get back.
l_l
17 years ago
If I was going to do encoding, why wouldn't I do something like Base 62?
Hans
17 years ago
Sure, that works, too. Let me run a few tests with 62..
Hans
17 years ago
With Base62, after 47,000,000 links it has 5 digits
Hans
17 years ago
At 10,000,000 it has 4 characters
l_l
17 years ago
Are you sure this is really going to help me? I get up to 17850625 with 4 chars.
Hans
17 years ago
1mil: 4, 750k: 4, 250k: 4, 100k: 3, 1k: 2, 50: 1
l_l
17 years ago
Yeah, I get up to 274625 with 3 chars.
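The figures traded back and forth here can be checked with a few lines of arithmetic. This is a rough sketch with hypothetical helper names: `code_length` counts the digits a number needs in a given base, and `capacity` counts how many codes of one exact length exist.

```python
def code_length(n: int, base: int) -> int:
    """Number of base-`base` digits needed to write the non-negative integer n."""
    length = 1
    while n >= base:
        n //= base
        length += 1
    return length


def capacity(base: int, length: int) -> int:
    """How many distinct codes of exactly `length` characters exist."""
    return base ** length


# Base-62 lengths for the row counts mentioned in the thread.
for n in (50, 1_000, 100_000, 250_000, 750_000, 1_000_000, 10_000_000, 47_000_000):
    print(n, code_length(n, 62))

# Capacities for a 65-symbol alphabet.
print(capacity(65, 3), capacity(65, 4))
```

Under these definitions the base-62 lengths come out as 1, 2, 3, 4, 4, 4, 4 and 5, and the 65-symbol capacities are 274,625 and 17,850,625, matching the numbers quoted above. In base 36, 47,000,000 needs 5 characters, since 36^4 is only 1,679,616.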
Hans
17 years ago
With a modified algorithm (base 66) I can get 4 characters with 17850625. This is a much easier way to do it - less code and less CPU.
Hans
17 years ago
With Base66, I can get up to about 19.0 million with 4 characters.
Hans
17 years ago
(Base66 is Base64 with these extras: "-_()"
Hans
17 years ago
*Base66 is Base64 with -_(), sorry, typing too fast :-P
l_l
17 years ago
Problem is, I'm not sure if I can put those in a URL.
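One way to check the URL concern: RFC 3986 lists letters, digits, "-", ".", "_" and "~" as unreserved characters, while "(" and ")" fall in the reserved sub-delims set. Python's `urllib.parse.quote` follows that split (for "~" since Python 3.7), so it can serve as a quick test. `survives_unescaped` is a hypothetical helper name.

```python
from urllib.parse import quote


def survives_unescaped(ch: str) -> bool:
    """True if quote() leaves the character as-is when nothing is marked safe."""
    return quote(ch, safe="") == ch


# The four extra symbols proposed for the larger alphabet.
for ch in "-_()":
    print(repr(ch), survives_unescaped(ch))
```

By this check "-" and "_" pass through untouched, while "(" and ")" are percent-encoded as %28 and %29. Many browsers and servers tolerate raw parentheses in a path, but that is not guaranteed by the spec.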
l_l
17 years ago
Aren't I basically doing base 65 anyway?
Hans
17 years ago
Yes, this is just a cleaner and simpler way - actually converting, without loops and arrays. I think that if this site is going to...
Hans
17 years ago
receive heavy traffic, it's a more viable option.
l_l
17 years ago
So, how easy would it be to make it compatible with the URLs that are already encoded?
Hans
17 years ago
I think you'd have to reset the database - those URLs aren't encoded according to Base65/64 rules, so they wouldn't work.