Mar 232016
 

So I ran into an interesting issue a while back doing something on a project at work. I was doing something that involved interacting with files on S3, and had updated the AWS libraries my application was using, but I was running into errors every time I tried to do anything that touched S3. The issue wound up being a dependency conflict between my code and an S3 wrapper utility we wrote around the AWS SDK. While the main project’s AWS SDK was up-to-date, the utility had older versions that were causing the errors. Although I had used Amazon’s BOM and thought that made sure that my AWS libraries were all set to a particular version, the reality was that assurance didn’t spread to other libraries my project pulled in. The solution was simple enough – update the AWS SDK versions in our other internal library, but that did lead to the question, how does one manage dependencies between projects?

Continue reading »

 Posted by at 10:56 PM
Jan 312016
 

I recently finished up some data aggregation work involving Apache’s Hive, and as a means of getting some MapReduce work off the ground quickly, it’s pretty good. Hive’s goal is to abstract away MapReduce behind basic SQL queries, and on that front it succeeds. The fact that I’m ultimately doing MapReduce jobs is hidden except for what would look like a minor quirk if I didn’t know that was what was going on under the covers. That said, there were a couple of things I noticed both during development and with running the jobs on Amazon’s EMR service that are worth noting.

Continue reading »

 Posted by at 10:05 PM
Sep 072015
 

1 of the last things I worked on in my previous job was converting a cron-generated set of emails about server SLA stats into a web service. Early on into this process, I looked at the ELK stack, discussed it with my team lead, and the two of us ultimately came to the conclusion that going that route would be overkill, and that instead we should just write our own utility. That, boys and girls, was a stupid, stupid mistake. Instead of using an existing, free, open-source, project, we decided to re-invent a wheel that other people had already invented, debugged, and was working well for a lot of projects, including bigger ones than what we were intending to do. Continue reading »

 Posted by at 10:51 AM
Feb 072015
 

Pretty much everybody in the developed world (and most of the developing world) interacts with Facebook as an application. Not many people have to actually deal with Facebook from an API level. I’ve spent a few months writing some code that tries to perform a few simple tasks on Facebook, and it’s been rough. Here’s a random collection of things I’ve learned, gotchas, and other points worth noting in the process. As a brief point of reference, most of my interaction with Facebook comes from the server-side code, written in Java, although I’ve played around with Facebook’s JavaScript SDK as well.

Continue reading »

 Posted by at 3:31 PM
Nov 072014
 

1 of the last projects I worked on at my previous job involved aggregating, storing, and querying log data into and from Elasticsearch (yes, I know that Logstash does that – and in reality I should have gone that route). That, along with some lookups on the data outside of the code, gave me a chance to start playing with Elasticsearch. After my brief experience with it, I can tell you there’s a lot of power in Elasticsesarch, but it’s going to take you a surprisingly longer to figure out how to tap it than you would expect. Continue reading »

 Posted by at 12:28 PM