# code_for(0) => 0
# code_for(62) => 10
# code_for(124) => 20
# code_for(1000000) => 4c92
CHARS = ('0'..'9').to_a + ('a'..'z').to_a + ('A'..'Z').to_a

def self.code_for(identifier)
  new_code = ""
  begin
    current = Integer(identifier % CHARS.length)
    new_code = CHARS[current] + new_code
    identifier = (identifier - current) / CHARS.length
  end while identifier != 0
  return new_code
end
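One possible tidier variant, as a minimal sketch rather than a definitive refactoring: it keeps the same 62-character alphabet as `CHARS` above, uses `Integer#divmod` to get quotient and remainder in one step, and adds a decoder so codes can be mapped back to identifiers. The module name `ShortCode` and the method `id_for` are hypothetical names introduced here for illustration; they are not part of the original post.

```ruby
# Hypothetical module name; the alphabet mirrors the CHARS constant above.
module ShortCode
  CHARS = (('0'..'9').to_a + ('a'..'z').to_a + ('A'..'Z').to_a).freeze
  BASE  = CHARS.length # 62

  # Encode a non-negative integer as a base-62 string.
  def self.code_for(identifier)
    unless identifier.is_a?(Integer) && identifier >= 0
      raise ArgumentError, "identifier must be a non-negative integer"
    end
    return CHARS[0] if identifier.zero?

    code = +""
    while identifier > 0
      # divmod returns [quotient, remainder] in a single call.
      identifier, remainder = identifier.divmod(BASE)
      code.prepend(CHARS[remainder])
    end
    code
  end

  # Decode a base-62 string back to its integer (hypothetical helper).
  def self.id_for(code)
    code.each_char.reduce(0) do |acc, char|
      digit = CHARS.index(char) or raise ArgumentError, "invalid character: #{char.inspect}"
      acc * BASE + digit
    end
  end
end

puts ShortCode.code_for(1_000_000) # prints 4c92
puts ShortCode.id_for("4c92")      # prints 1000000
```

Because each identifier maps to exactly one code and back, sequential identifiers give unique codes without any lookup.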
Refactorings
steved
February 11, 2010 18:43
Fixnum#to_s takes an optional base argument between 2 and 36:
>> 1000000.to_s(16) => "f4240"
bob
February 11, 2010 20:11
@steved: Yeah, but his base is 62, and it treats lowercase and uppercase letters as separate digits.
steved
February 12, 2010 16:43
@bob: Yeah, but his _real_ requirement is bit.ly-type URLs. You could debate whether they should be case-sensitive or not, but base 36 will let you handle a lot of URLs and still have small identifiers.
>> 1_000_000_000.to_s(36) => "gjdgxs"
>> "gjdgxs".to_i(36) => 1000000000
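The base-36 suggestion could be wrapped up roughly as follows. This is only a sketch: `short_code` and `id_from_code` are hypothetical helper names, and the validation step is an added assumption, included because `String#to_i` silently returns 0 for input it cannot parse.

```ruby
# Hypothetical helpers built on Integer#to_s(36) and String#to_i(36).
def short_code(id)
  id.to_s(36)
end

def id_from_code(code)
  # String#to_i is lenient, so reject anything outside 0-9 / a-z up front.
  unless code =~ /\A[0-9a-z]+\z/i
    raise ArgumentError, "invalid code: #{code.inspect}"
  end
  # to_i(36) ignores letter case, so "GJDGXS" and "gjdgxs" give the same id.
  code.to_i(36)
end

puts short_code(1_000_000_000) # prints gjdgxs
puts id_from_code("GJDGXS")    # prints 1000000000
```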
steenslag
February 19, 2010 21:16
For ruby 1.8 there is the gem base62. (Version 0.1.1 on github is compatible with ruby 1.9). It encodes and decodes.
require 'base62'
puts 123456789.base62_encode # => "8M0kX"
puts "8M0kX".base62_decode # => 123456789
mxcl
February 22, 2010 11:17
I experimented with this too, with a very similar base 62 implementation. I found that using String#to_i(base) was 6 times faster than the best I could do. And as pointed out above, you still get pretty tiny URLs for a lot of shortened links.
Also there is the additional accessibility bonus that people can more easily read the URL out loud (without worrying about caps). I agree this is not really the point, but it's something.
The base62 gem may be worth a look, though.
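For anyone who wants to check the speed difference on their own machine, a rough benchmark sketch using the standard `benchmark` library might look like this. The helper `base62_code`, the constant `CHARS_62`, and the sample size are assumptions made for illustration. It times encoding only, and the ratio will vary with Ruby version and hardware.

```ruby
require 'benchmark'

# Hypothetical hand-rolled base-62 encoder used as the comparison point.
CHARS_62 = (('0'..'9').to_a + ('a'..'z').to_a + ('A'..'Z').to_a).freeze

def base62_code(id)
  return CHARS_62[0] if id.zero?

  code = +""
  while id > 0
    id, rem = id.divmod(62)
    code.prepend(CHARS_62[rem])
  end
  code
end

ids = (1..100_000).to_a # arbitrary sample size

Benchmark.bm(10) do |bm|
  bm.report("base 62:")  { ids.each { |id| base62_code(id) } }
  bm.report("to_s(36):") { ids.each { |id| id.to_s(36) } }
end
```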
steenslag
March 6, 2010 00:05
I just found the radix gem (http://rubyworks.github.com/radix/) by Thomas Sawyer, which provides the means of converting to and from any base. It defaults to base 62! Representation strings are user-definable, up to base 62.
I'm trying to create Bit.ly-style URL identifiers given an input number. These can be completely sequential, and should be able to be generated without duplicates in O(1) time. Basically, I don't want to have to check if each code is unique, and they need to be small. I have a really shoddy implementation now that SEEMS to work, but it looks like there should be a better way of accomplishing this.
Here's some sample input I'd think