National Transportation Data Challenge @ JupyterCon
Been a while since I’ve written much. I was asked to join the team running the National Transportation Data Challenge, not as a data scientist, but as a project manager to help keep all the big data people on track and moving forward. It’s an interesting use of my skillset and after a slow start I’ve been devoting more and more of my time to it. That’s part of the reason there haven’t been any Deep Learning updates.
The challenge is sponsored by the US Big Data Regional Innovation Hubs, who are funded by the National Science Foundation. In fact, much of the work is done by volunteers, academic researchers, corporate partners and others who are simply interested in the topic, forming “spokes” around the various hubs. As with any such initiative, there are a variety of intersecting (and sometimes competing) interests involved, so managing things and keeping them going forward can be challenging. In the past two weeks we settled on our final goal: a demonstration event in Washington D.C. on November 9th, with guests from the NSF, Department of Transportation, Department of Homeland Security, other government agencies, and corporate/academic partners.
After lots of work behind the scenes, we showcased the Challenge to a more general public this past week. The chosen venue was JupyterCon in New York, and since I’m always up for a visit to my hometown (and mom is always happy to have some things fixed around the house!), I joined the contributors from Big Data Innovation Hubs around the country and other volunteers in Manhattan last week for a couple of days.
The conferences I’ve been to recently have been somewhere between “huge” and “massive,” often with thousands of people in attendance, huge exhibit halls, and sometimes a dozen simultaneous presentations going on. Conferences like these (SCaLE is my favorite) can be a lot of fun, but also overwhelming. JupyterCon was pretty small in comparison to those, which made for a less frantic experience and more in-depth conversations with the people I did speak with. Also, the focus was naturally much tighter. The sponsor hall was small, only a couple of t-shirts were acquired, and the venue at the New York Hilton was easily manageable. Happily, the Manhattan weather also cooperated.
As I was only able to stay for the first two days, my content options were limited but generally engaging. I heard a couple of talks, mostly dealing with data visualization and interactivity in Jupyter, which, in addition to neural nets, are hot topics for me right now. The opening night poster session that the Challenge participated in had a good variety of presenters, and was a nice forum for networking with others who have done interesting work.
Leaving there, I found myself looking forward to other smaller events I plan to participate in over the coming months, in the US, Canada and New Zealand. Kiwi friends look out, here I come!
Data Challenge at the Poster Session
For us, this was a public launch weekend. While work has been going on for a while, we just got the challenge sign-up form out there, and have just started promoting it. The O’Reilly people were nice enough to offer us both a space at the poster session and a small table on which to run demos.
The demos, which were well received, were sponsored by DataScience Inc. In addition to showcasing the use of Jupyter and Python, they showcased the capabilities of the DataScience platform, which includes access to a variety of donated computing resources, datasets, links to GitHub repositories containing additional code, and a very neat unified platform on which the resulting applications and analysis can be easily deployed for use. A sampling of demos is available to the public, with far more available to anybody who signs up for the challenge.
We wrapped up with a good Indian dinner for the crew who remained to the very end. It made for a nice end to our first public foray as the Transportation Data Challenge team.
Data Challenge at Lunch
The following day, we hosted two tables at lunch for anybody interested in the Challenge or the activities of the Hubs in general. Had interesting discussions with the Data Science lead from General Assembly, as well as other supporters from fast.ai, the University of Arizona, Internet2 and other organizations and individuals interested in our work.
I found it interesting how many of the fields I’ve explored over the years in both professional and personal capacities seem to overlap once you start looking at the core data we all need and want to analyze. The Challenge, of course, is approaching things from a transportation perspective, but many of the things we are looking at overlap with what people examine when evaluating sustainability initiatives, smart campus planning, and disaster response. In fact, a follow-up initiative has been to expand the scope of the challenge to looking at the impacts of Hurricane Harvey, which struck the Gulf Coast just a few hours after our lunch.
As a remote project manager, it’s always good to meet as many of the team members as possible in person. While not all the participants, or even all the leaders, were able to make it to this event, I met a number of the key people and have a better feel for how everybody will interact. That’s a good thing for me, and for the Challenge overall. It was also a lot of fun and very informative. I’m looking forward to the ongoing activities.
Lots of organizational stuff is supposed to happen in the coming weeks. We need to choose a location for the data challenge final showcase in DC. The team needs to come together and plan the event. In parallel, we have groups working on various aspects of transportation data, using a variety of contributed resources. We have two hackathons planned, and we may present the outcomes of at least one of them at Open Data Science West, in San Francisco the week before our DC event. Lots to do to make this happen.
Disclaimer: Opinions and observations are my own. Nothing in this blog should be understood as reflecting the views or policies of the National Science Foundation, the Regional Big Data Innovation Hubs, or any other organizations or individuals associated with the National Transportation Data Challenge.