David McGinnis
Mar 17, 2020 · 6 min read
Testing a Hive Patch on a Local System
[...] I needed to get a Hive cluster running my code and a Confluent cluster that could output Avro messages in the proper format to test.
David McGinnis
Oct 22, 2019 · 4 min read
Running Garbage Collection on Your Cluster
At a high level, [CGC] is merely going through the cluster, taking inventory of the data and processes that run on the cluster...
David McGinnis
Oct 16, 2019 · 6 min read
Writing Environment Agnostic Code
[...] we'll discuss some of the ways we can write environment agnostic code, which can be run on any environment within your enterprise.
David McGinnis
Sep 29, 2019 · 4 min read
YARN Capacity Scheduler and Node Labels Part 2
How do we ensure that GPU jobs run on worker nodes with GPUs without buying expensive GPUs for all of our worker nodes?
David McGinnis
Sep 22, 2019 · 5 min read
YARN Capacity Scheduler and Node Labels Part 1
I'm going to explore exactly how YARN works with queues, and the various mechanisms available to control how YARN does this.
David McGinnis
Sep 1, 2019 · 10 min read
Machine Learning Solutions: Recommender System Design
With the help of tools like Spark’s MLlib ... [making a recommendation engine] is something that many companies have done and you can too.