February 2012
9 posts
Notes on Linked Lists
In an effort to improve my interview questions and to review my own knowledge of the basics, I have started studying basic data structures and algorithms. I actually do this every few years or so, just to keep it fresh. Since I use lists and their variants every day, I thought this might be a good place to start the review.
The interesting thing about linked lists is that for 99% of...
11,000 plays with no paper trail
This recording has gotten more than 11,000 plays, and neither I nor SC has any idea why. So much for big data analytics.
Programmers are athletes
As my career advances, something is becoming more and more obvious: I am surrounded by people younger than me. At 35, I am not middle-aged, I am an old man. In my corporate environment, there is a preference for hiring very young people, right out of school. These kids are bright, hungry, and have almost no opinion about anything. Perfect to mold into the company’s image. The older...
Logistic Regression
-NOTES, NEEDS FINISHING-
A basic classification algorithm that predicts the probability of an event occurring by fitting data to a logistic curve.
Hypothesis Representation
In logistic regression, we want the hypothesis to be $$0 \le h_\theta(x) \le 1$$
For this, we use the sigmoid (logistic) function:
$$\begin{eqnarray} g(z) &=& { \frac{1}{1 + e^{-z}} }...
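As a rough illustration of the sigmoid above, here is a minimal Python sketch. The `sigmoid` function follows the formula $g(z) = \frac{1}{1 + e^{-z}}$ directly. The `hypothesis` helper is an assumption on my part: it uses the usual form $h_\theta(x) = g(\theta^T x)$, which these notes are cut off before stating, and the names `theta` and `x` are illustrative only.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function g(z) = 1 / (1 + e^(-z)).

    The two branches are algebraically identical; splitting on the sign
    of z keeps math.exp from overflowing for large negative inputs.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta, x):
    """Assumed form h_theta(x) = g(theta^T x); not taken from the notes."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)
```

Because `sigmoid` maps any real number into the interval from 0 to 1, `hypothesis` satisfies the constraint $0 \le h_\theta(x) \le 1$ for any choice of `theta` and `x`.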
Example MapReduce classes in Scala →
Google's Gson List deserialization and variable...
I spent much of the day today hung up on two very simple and unrelated problems while running an HBase MapReduce job whose mapper collects data stored as JSON from HBase and writes it to the context as Text (serialized JSON). The reducer deserializes the data and does some long-running calculations on it.
The primary reason these calculations must be run in the reducer is that the HBase scanner...
HBase and submitting remote MapReduce jobs to...
I’ve been working with the Hadoop ecosystem quite a bit over the last 3 weeks. For as much documentation as exists on the topic, most of the books, online tutorials, and the like cover only the basic use cases. Many of these use cases revolve around using shell scripts to call command line tools (tiny apps with a main method that invoke other things).
By following the basic tutorials, it is...