My name is Yuxuan Chen, a Brown University CS master’s student, and I spent this summer as an intern at VoltDB. It’s been a valuable experience for me.
I worked on the ecosystem team, and my work mainly focused on optimizing client APIs, deployment of VoltDB, and troubleshooting. Working on tasks that touched different aspects of VoltDB helped me get a better understanding of the product and its internal structure. Based on this, I was able to make contributions to the development of the product.
The first task I was assigned was adding header-line support to the CSV loader. Previously, when loading data from a CSV file into VoltDB, the file had to have the same number of columns as the table, in the same order. For example, if the columns of a table are (ID, NAME, EMAIL), then the columns in the CSV file also had to be (ID, NAME, EMAIL). Although this works in most cases, sometimes we want to load only part of the file into the database, or the order of the file's columns does not match the table. It would be much more convenient for users if the file's columns did not have to match the table exactly, with a header line of column names describing the file's layout instead. For instance, if the columns of the table are (ID, NAME, EMAIL) but the header of the file is (EMAIL, NAME, ID) or (NAME, ID), csvloader can now adjust the order and load the data automatically when users specify the --header option. One thing to be aware of is that the --header and --procedure arguments are mutually exclusive. More details about the --header option can be found in the csvloader documentation.
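To make the idea concrete, here is a minimal Python sketch of header-based column mapping. It is only an illustration, not the csvloader implementation: the function name reorder_rows is hypothetical, and filling columns that are missing from the file with None (as a stand-in for NULL) is an assumption about how such a loader might behave.

```python
import csv
import io


def reorder_rows(csv_text, table_columns):
    """Yield each data row rearranged into table-column order.

    The first line of csv_text is treated as a header of column names.
    Table columns absent from the file are filled with None (assumed
    here to stand in for NULL).
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip().upper() for name in next(reader)]

    # Reject header names that do not exist in the target table.
    unknown = [name for name in header if name not in table_columns]
    if unknown:
        raise ValueError(f"columns not in table: {unknown}")

    # Map each column name to its position in the file.
    position = {name: i for i, name in enumerate(header)}
    for row in reader:
        yield [row[position[c]] if c in position else None
               for c in table_columns]


table = ["ID", "NAME", "EMAIL"]
print(list(reorder_rows("EMAIL,NAME,ID\na@x.com,Ann,1\n", table)))
print(list(reorder_rows("NAME,ID\nBob,2\n", table)))
```

The first call shows a file whose columns are in a different order than the table, and the second a file that supplies only a subset of the columns.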
This was the first time I made a contribution to the product. And when my branch was merged into the master branch and the feature was added into the product and documented, I felt a huge sense of achievement. I think it would be very hard to get this feeling if I had chosen to do an internship at another company, since not all companies will allow an intern to add a feature directly into the product. VoltDB is unique in providing this opportunity to interns.
Another interesting task I took on was fixing a client thread leak. For users running the VoltDB client inside a web application, say on a Tomcat server, some static resources could be leaked. For example, the ‘Reverse DNS lookups’, ‘Estimated Time Updater’, ‘VoltDB Client Reaper Thread’, and ‘Async Logger’ threads could be left running after a web application was re-deployed. We needed to shut these threads down to avoid leaking them. To do that, I kept track of the number of active clients and stopped those threads when the count dropped to zero. When the number of clients rose above zero again, the threads were restarted.
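The counting scheme described above can be sketched in a few lines of Python. This is a hedged illustration of the general reference-counting pattern, not the actual VoltDB client code (which is Java): the class SharedWorkers, its methods, and the placeholder work loop are all hypothetical names made up for this example.

```python
import threading


class SharedWorkers:
    """Background threads shared by all clients.

    The threads start when the first client arrives and stop when the
    last client goes away, so nothing is left running afterwards.
    """

    def __init__(self, names):
        self._names = list(names)
        self._lock = threading.Lock()
        self._clients = 0
        self._stop = None
        self._threads = []

    def _run(self, stop):
        # Placeholder for real periodic work; exits once stop is set.
        while not stop.wait(0.01):
            pass

    def acquire(self):
        """Register a client; start the threads if it is the first one."""
        with self._lock:
            self._clients += 1
            if self._clients == 1:
                self._stop = threading.Event()
                self._threads = [
                    threading.Thread(target=self._run, args=(self._stop,),
                                     name=name, daemon=True)
                    for name in self._names
                ]
                for thread in self._threads:
                    thread.start()

    def release(self):
        """Unregister a client; stop the threads if it was the last one."""
        with self._lock:
            if self._clients == 0:
                raise RuntimeError("release() called without acquire()")
            self._clients -= 1
            if self._clients > 0:
                return
            self._stop.set()
            threads, self._threads = self._threads, []
        # Join outside the lock so new clients are not blocked.
        for thread in threads:
            thread.join()

    def running(self):
        """Return how many shared threads are currently alive."""
        with self._lock:
            return sum(thread.is_alive() for thread in self._threads)
```

Each start creates a fresh stop event, so a client that arrives while the old threads are still shutting down gets a new, independent set of threads. The counter and the start/stop decisions are guarded by a single lock because they are shared, static-like state touched by every client.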
Dealing with threads and static state can be tricky. I needed to watch for the side effects of changing global state. Fortunately, my teammates and team leader were very generous with their help, explaining the call hierarchy, how these threads worked, and what they were for. This information really helped me write my own code on top of the existing implementation.
Many other tasks were valuable to me and enhanced my skills. Writing my own code taught me a lot about software development, and reading others’ code played an equally important part. Seeing the many design patterns used in the product showed me how and where to apply them.
I really feel lucky I had an internship at VoltDB, and the experience this summer has been really precious to me.