goatarchitecture:
“…Giant parabolic slides, or Parabelrutsche located in the Mathematik/Informatik building of the Technical University of Munich, Garching campus. The preferred method for riding them appears to be on carpet squares…”
WOW this looks so AWESOME
(via fuckyeahmath)
Filed under math
Write a draft really fast using my own opinions, then find concrete details to support my own opinions. Finding ideas should be faster than processing tons of different ideas and choosing sources based on that. 280 words so far.
Are there any (free) Twitter clients that display only interesting tweets? Like you tell it which tweets are interesting and then it’ll give an “interesting” stream and also the “full” stream? There seem to be a lot of tweets that I get that I don’t even read… If there isn’t a twitter client that does this I’ll just make my own (with CLIENT SIDE BAYESIAN CLASSIFIERS!). Yeah, I’ll just make my own.
</random-thought>
A mini-Reddit effect from /r/programming, can /r/pics do any better?
Recursion is when something is defined by referencing it’s self. Google has an amazing explanation for recursion, just Google search “recursion” (no need to click on any links even! it’s right there in Google).
Tail call recursion is when something references it’s self at the end. It’s hard to explain so I’ll just explain by example. Lack of proper tail call recursion leads to major slow downs as well. I had heard that but never put it to the test so I wrote a couple of programs to show how detrimental improper tail call recursion can be.
The simplest example is the factorial function, it has a nice recursive definition. (Note to self, add MathML magic to this Tumblr, until then take images from Wikipedia).

I wanted to see how bad implementing this naively would be so I wrote two programs in F# to find the factorial of 1000. (F# has a BigInteger class built in so that’s awesome).
#light
let rec badFact n =
if n <= 1I then 1I
else n * (badFact n-1I);;
(* bad because no tail call optimize *)
let bigI = 1000I;;printfn "%A! has %A digits" bigI (string (badFact bigI)).Length;;
And I wrote another version that can be tail call optimized.
#light
let rec goodFact n prod =
if n <= 1I then prod
else goodFact (n-1I) (prod*n);;
let bigI = 1000I;;
printfn "%A! has %A digits" bigI (string (goodFact bigI 1I)).Length;;
The difference is subtle but the performance difference is significant. As it turns out, the naive implementation stackoverflows. The good one finishes in less than a second.
I thought this may be because it’s running on .NET which isn’t exactly designed for lots of functional things (it’s still good though, as the proper implementation ran very fast). I decided to write these in Haskell to see Haskell with it’s trampoline-like calling system be able to handle the bad version.
The Haskell version successfully ran the bad version but so quite slowly, as expected. On my Macbook it ran in 0.413 seconds. That may seem fast but the better one only required 0.007 seconds. This may be a reason why people sometimes think Haskell’s performance is unpredictable, improper recursive calls that, if not for Haskell’s beastly function calling mechanism, would cause stack overflows.
Why?
I should probably explain why the “bad version” is bad and why the good version runs circles around it. This has to do with how functions are called. Down at the assembler level there is a “stack” used for function calling. When in one function and calling another, the function “pushes” it’s location onto the stack so that the function that it calls knows where to (jmp) return to. When the program does “n * (badFact (n-1))” it has to call badFact and then return and finally multiply with “n”. In the goodFact it returns the result of goodFact directly, no multiplying by “n” at the end. This allows the compiler to optimize and turn this into a loop, i.e. no need for the calling stack. This is why the good version doesn’t stackoverflow and the bad one does. (Also Haskell’s calling system might not overflow as fast but it certainly adds overhead to the calculation, making the bad version very slow).
Filed under CompSci
I think this is largely the reason Haskell is hardly used and other languages like OCaml and Lisp are more widespread.
Functional programming languages aren’t very widespread, but of them, Haskell is hardly used outside of academia at all. I was thinking today why it’s not and why Lisp and ML based languages are (to some degree). Last night I was reading about a couple of Financial firms that use OCaml (an ML variant) and I realized why they might prefer that over Haskell, Haskell has lazy evaluation whereas OCaml has eager evaluation.
I know Haskell to some degree and just starting to learn F# (Microsoft’s version of OCaml, largely the same syntax plus benefits of .NET). Haskell is really eloquent but some things are just hard to do with it. Haskell is good at doing math stuff but I wouldn’t consider making a bot or webservice with it. I think the reason it becomes hard is because of it’s laziness.
Laziness is Haskell’s two sided sword, it’s a very powerful language feature (i.e. you can represent infinite sequences and lists without bounds) but it also makes certain things convoluted and hard to do. ML based languages seem mostly “pure” (purely functional) but they lose their “purity” to some extent to make things easier. I really need more experience to make a fair opinion, but I don’t think making infinite sequences happens too frequently (except maybe in math software).
Filed under CompSci
I’ve done a small “survey” of publicly accessible tweets recently. I used Twitter’s Streaming API to download a small stream of tweets (about 1-2% of all tweets during that time) for a little over 3 hours the other day. The result was roughly 285 MB of JSON data, each tweet object is delimited by a newline.
I wrote a small Python program to process these tweets and give some stats about them. The program runs in about 15 seconds on my 2008 Macbook but I haven’t determined if the slowness is due to reading the file or processing the data, my guess is reading the file but I’ll need to time that later.
The program reads the file line by line and tries to parse the JSON encoded tweet. Then it determines if the tweet has Geo data, if the tweet is a 140 char tweet, and it also creates a “word” distribution of the tweet. I say “word” but really it just splits the text for every space, so that means words include links, hashtags, usernames, and punctuation i.e. “this.” is different than “this”.
The program was meant to just give a simple look on the data and here are the results:
Tweets: 139591
GEO: 996
Characters: 9228541
Average Length: 66.1112894098
140s: 4290
140%: 3.07326403565
Unique 'words': 338538
Total hapaxes: 278362
Filed under CompSci