I was asked to join the team running the National Transportation Data Challenge, not as a data scientist, but as a project manager helping keep all the big-data people on track and moving forward. It’s an interesting use of my skill set, and after a slow start I’ve been devoting more and more of my time to it. After lots of work behind the scenes, we showcased the Challenge to a more general public last week at JupyterCon in New York.
… a more careful understanding of the data might suggest that Amazon is a lot more important than it may seem at first glance, and that the lack of benefit from “savings” in other areas can easily be understood when the data is explored more thoroughly. As a data scientist, that’s the kind of thing I need to be able to do in order to tell a story that is not only compelling, but accurate.
I found TensorFlow confusing at first, but quite comfortable once it clicked. It’s odd how, after programming in a language like Python for a while, it feels strange to declare “placeholders” (for input values) and constants up front, building a computation graph, and only then initialize things and actually run it.
I left last week’s PyData Meetup with more questions than answers. Questions like “why does that neural net I just wrote perform the way it does?” So, with a couple of weeks left until the next project is due, I decided to go back and revisit the second half of the neural networks topic before moving forward.
There is little here that I could not learn on my own. But I find it useful to learn along with others, and the structure that programs like this provide can be worthwhile, so long as it isn’t too expensive. For me, the structure and the ability to discuss issues and problems with others were the key things that made this summer’s effort worth the $600.
The basic idea is to set some goals, share them with anybody who is watching the #SoDS17 tag on Twitter, then provide updates on Twitter, with more detail in another location. I’ll be using this blog for that purpose.