Interview with a SWIM Developer: Building an Augmented Reality Rover
by SWIM Team, on Aug 8, 2018 1:18:01 PM
Photo of the augmented reality (AR) controlled rover built by SWIM.AI frontend developer, Scott.
We recently sat down with Scott, a senior frontend developer at SWIM.AI, to discuss one of his current projects, an augmented reality (AR) controlled rover. Scott shared his thoughts on the challenges of using some open source databases in robotics, and why he decided to use SWIM instead. The following conversation covers the importance of real-time data for robotics and control applications, and how implementing SWIM has led Scott to explore further autonomous capabilities for his rover.
Here's what Scott had to say:
Interviewer: So how did you get the idea to build an autonomous Rover?
Scott: Well, I’ve been doing robotics for about 10 years as a hobby. This idea started with an old RC car that I realized I could attach a Raspberry Pi to, and the ideas just grew from there. My goal now is to build a semi-autonomous rover that can be controlled by a combination of interactive software and programming logic.
Interviewer: Describe the Rover and hardware components you’ve integrated together here.
Scott: As I mentioned, the frame comes from a 1/10-scale RC car. On the car, I’ve attached a variety of sensors, including five distance sensors - four up front and one in the back - and two webcams mounted on a “pan & tilt” head on the front. That pan & tilt head is important because the webcams also allow the rover to look around and measure its distance from objects.
Also on the Rover are a Raspberry Pi, an Arduino, a servo controller, and a breadboard. On the breadboard, there’s a 3-channel accelerometer, a magnetometer, and a compass that allow the Rover to infer movement, direction, and orientation. All those sensors feed into the Arduino, which converts the inputs from analog to digital values and then pipes those values over a serial port. On the Raspberry Pi, NodeJS is running with several processes for reading and writing to several serial channels, to facilitate communication with both the Arduino and the servo controller. There’s a Node class called sensors that receives all the sensor values off the serial connection and sends the data directly into SWIM. This provides a basic data bridge.
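The serial-to-SWIM bridge Scott describes could be sketched like this. This is an illustrative example, not the project's actual code: the `name:value` line format and the sensor name are assumptions made for the sketch.

```javascript
// Hypothetical sketch of the Node-side parsing in the sensor bridge:
// the Arduino streams newline-delimited "name:value" readings over serial,
// and each reading is turned into a record before being sent into SWIM.

function parseSensorLine(line) {
  const sep = line.indexOf(':');
  if (sep < 0) return null;                  // ignore lines with no separator
  const name = line.slice(0, sep).trim();
  const raw = line.slice(sep + 1).trim();
  if (name === '' || raw === '') return null; // ignore incomplete readings
  const value = Number(raw);
  if (Number.isNaN(value)) return null;       // ignore non-numeric payloads
  return { name, value, at: Date.now() };
}

// A bridge process would attach this to the serial port, e.g.:
//   serialPort.on('data', chunk => { /* split into lines, parse, forward */ });
```

Each parsed record would then be forwarded to the SWIM sensor service, so malformed serial noise is dropped before it ever reaches the data layer.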
A closeup of the hardware components used on the augmented reality controlled rover.
Interviewer: And the Rover has some semi-autonomous capabilities?
Scott: Yes, I’ve got it hooked up to Gear VR. So if you have the page running on your cell phone inside a Gear VR headset, it’ll show you the video streams from each camera. The sensor data from your phone is also being sent to the Rover in order to move the pan & tilt in sync with your phone, and allow you to look around from the viewpoint of the robot’s head. Because we have two webcams, you’ll get a full 3D stereoscopic view.
Interviewer: When you first tried building this, you were using MongoDB?
Scott: Originally, MongoDB ran where SWIM runs today on the Raspberry Pi. I initially used a database because I was trying to make a central place where all the various sensors and equipment connected on the Rover could share data. Also, because I wanted to use the data from the heads-up display on Gear VR, there were external devices that would be sending data to the Rover. I needed a way to maintain everything’s state and intent.
Interviewer: So what prompted the change?
Scott: Well, the first problem I ran into was with the sensor data. Because I was sending so much sensor data so quickly while trying to read it, Mongo would essentially melt down. I would either get locks because I’m trying to read and write at the same time, or the whole database would crash… which sucks.
The second problem was creating and managing the database. With Mongo, or most any DB, you have to have a script that creates the database when starting from nothing. Initially, I forgot to write those scripts as I was figuring out the first data structure. At one point my DB got corrupted due to user error, and because I didn’t have the DB creation scripts, I was faced with recreating the DB by hand. This also meant that I couldn’t easily wipe and recreate the data structure during testing without first writing that DB creation script, which frankly I didn’t want to do. With SWIM, my data structure is created automatically by each SWIM service I create. This makes managing the data for the rover worlds easier, which means I can focus on using the data instead of managing it.
The third problem is related to figuring out the data structure. It’s so much easier with SWIM. Once a SWIM service starts, it creates its own lanes based on the sensor data it’s collecting. So the sensor data structure mirrors whatever I’m sending it and I don’t have to worry about changing tables or indexes or anything. It is what I send it. And so instead of having to worry about how to structure data so I can access it quickly, I’m interested in what data is important and how to surface it in the right place for the right thing.
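The "structure mirrors whatever I send it" behavior can be modeled with a tiny sketch. This is a toy model of the idea Scott describes, not SWIM's actual implementation: a store where a per-sensor slot appears the first time a new channel is seen, with no upfront schema.

```javascript
// Toy model of schema-free, lane-per-sensor storage: new channels are
// created automatically on first write, so adding a sensor "just works".

function createSensorStore() {
  const lanes = new Map();          // sensor name -> latest value
  return {
    put(name, value) {
      lanes.set(name, value);       // unknown channels appear automatically
    },
    get(name) {
      return lanes.get(name);
    },
    channels() {
      return [...lanes.keys()];     // whatever has been sent so far
    }
  };
}
```

Contrast this with a relational setup, where every new sensor channel would mean altering tables or indexes before any data could land.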
Needing to know the right data tables, and have the right queries set up, and needing to know all the sensor channels before you even get started - it’s limiting. Now I can throw new sensors on the Rover and it just works. Switching over gave me the real-time data I really wanted, without melting down MongoDB or the Raspberry Pi.
Interviewer: How long did you spend trying to get everything to work with MongoDB?
Scott: A month or two, and it never really worked well. Even when things worked, I had problems with latency and CPU usage on the Pi.
Interviewer: And how long did it take to get everything working with SWIM?
Scott: It didn’t take very long. Probably a couple of weeks to completely remove the Mongo stuff and re-implement that layer using SWIM, which got me not just back to the level I was at, but past the point where I left off with Mongo. Latency was no longer an issue, SWIM uses very little CPU, leaving more for Node, and I had more data available to external services than before. Like I said, when I tried to use Mongo, it didn’t work well. Now it actually runs really well, which is pretty cool.
Interviewer: But since you were using a database, wasn’t storing the long-term history of that data important?
Scott: Well, no. I didn’t do any long-term historical analysis because it was hard enough getting the real-time aspects working. But really, for what I want to do, long-term history of the sensor data isn’t important. What’s more important is the last few seconds of sensor data (for troubleshooting). Getting that into and out of Mongo was hugely difficult. With Mongo, I never got to the point where I could store historical data because I was focused on getting real-time data working first.
With SWIM, historical data was trivial to add to every sensor. On the sensor service, when a new sensor lane is created, a history lane is also automatically created and populated. This makes it really easy to set up a history lane that keeps track of everything for me, and I can easily control the size of that history lane to suit what I need. For the Rover, I can get the last few seconds of sensor data really easily and surface it in a graph using SWIM UI components. That little clip of data is really useful for seeing what just changed - in case I look away, I can look back and realize that something interesting just happened. It’s all available on a simple web page built with the SWIM UI Library, where I have a gauge and a time-series chart with the last few seconds of data for every sensor channel. There are 18 sensor channels all on the same page, updating in real time - which is really cool to sit, watch, and think about the behavior I want.
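A size-capped history like the one Scott describes behaves like a bounded buffer. The sketch below is illustrative only - the capacity and record shape are assumptions, and SWIM's real history lanes are configured differently - but it captures the "keep just the last few seconds" idea.

```javascript
// Toy bounded history: keep only the most recent `capacity` readings,
// so the chart always shows a short, fixed-size clip of recent data.

function createHistory(capacity) {
  const buf = [];
  return {
    push(reading) {
      buf.push(reading);
      if (buf.length > capacity) buf.shift();  // drop the oldest reading
    },
    latest() {
      return buf.slice();                      // snapshot for charting
    }
  };
}
```

One such buffer per sensor channel is enough to drive a time-series chart without any long-term storage at all.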
Interviewer: Tell me about the SWIM Rover application running on the Raspberry Pi.
Scott: It’s really just two SWIM Services. One is a Sensor Service that collects all the sensor data being sent to the Pi. The other is a Rover Service that incorporates the logic. It will always have an intent, even if it’s just idle. It can change intents and state of the Rover to drive particular behaviors.
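The Rover Service's "always has an intent, even if it's just idle" behavior can be sketched as a minimal state holder. The intent names below are assumptions for illustration - the interview doesn't enumerate the real states.

```javascript
// Minimal sketch of an intent-driven rover service: there is always a
// current intent (defaulting to idle), and only known intents are accepted.

function createRover() {
  let intent = 'idle';                       // the rover always has an intent
  const allowed = new Set(['idle', 'drive', 'track', 'stop']); // assumed names
  return {
    intent: () => intent,
    setIntent(next) {
      if (!allowed.has(next)) return false;  // reject unknown intents
      intent = next;
      return true;
    }
  };
}
```

In the real system, changing the intent would drive the behaviors Scott mentions, with the Sensor Service feeding the data the logic acts on.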
Closeup of the pan & tilt head which provides visual and sensory information to the rover.
Interviewer: That sounds pretty simple. How do the SWIM Services interact with the pan & tilt head?
Scott: So from the phone’s webpage, we send commands into SWIM with the phone’s tilt and heading, read from the phone’s magnetometer. In the Sensor Service, there’s a lane being updated with those values. Then, in Node on the Rover, there is a process which watches those same value lanes and sends commands to the servo controller to move the pan & tilt head. What SWIM also allows me to do is change which lane is controlling the servos. So I can have the RC controller changing the steering and drive motor, or just controlling the drive motor. It’s just a matter of which SWIM lanes I point them to.
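The last step of that chain - turning an orientation angle into a servo command - might look like this. The ranges are typical hobby-servo values (±90° of travel mapped onto 1000-2000 µs pulses, 1500 µs centered) assumed for the sketch, not measured from Scott's hardware.

```javascript
// Illustrative mapping from a phone tilt/heading angle (degrees) to a
// servo pulse width (microseconds) for the pan & tilt head.

function angleToPulse(deg) {
  const clamped = Math.max(-90, Math.min(90, deg));  // stay within servo travel
  return Math.round(1500 + (clamped / 90) * 500);    // 1500 µs = centered
}
```

The Node process watching the value lanes would run each new angle through a function like this before writing to the servo controller.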
Interviewer: You mentioned wanting to run some advanced analytics in Python. What does that architecture look like?
Scott: So the next thing I’m looking at incorporating is object detection off the webcams using TensorFlow and OpenCV. This will be a separate process running on either an Nvidia GPU board or an Amazon EC2 Elastic GPU. Another external process could also be reading the video feed to do facial recognition using OpenCV and Haar cascades on another GPU-accelerated EC2 instance.
Interviewer: It could either be on the Rover or in the cloud?
Scott: With SWIM, it’s wherever the Service is running, either on the hardware next to the Rover or in the cloud. We’re just running on the hardware that’s dedicated to doing that type of data processing - if it’s object detection, we’ll get the acceleration we need from GPUs - and you can send that data via command line back into SWIM on the Rover. This will probably be a new SWIM Service, so that I can add some intelligence about what the Rover sees and logic such as “I see the ball and I’ll follow the ball”. Or we could assign distributed tasks. The new Service could communicate “I see an object, I’ve detected an object, here’s the coordinates for that object,” and send that info to the Rover. Then the Rover would respond with “OK, I have an object, here’s the decision in my head, let me do the math and figure out where I need to move my head so that I’m still looking directly at the object.” That logic would happen in real-time like everything else in SWIM, so if you’re moving the object around, the Rover head would continue to track the object. What’s key is that the Rover head doesn’t know it’s doing that because of object detection; the head just knows it’s doing that because of sensor data changes in SWIM.
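The "do the math and figure out where to move my head" step could be sketched as follows. The frame size and field-of-view numbers are assumed values for illustration; the detection service would supply the object's pixel coordinates.

```javascript
// Rough sketch of object tracking: given an object's pixel coordinates in
// the camera frame, compute the pan/tilt corrections needed to re-center it.
// Assumed: 640x480 frame, ~60° horizontal and ~45° vertical field of view.

function trackCorrection(objX, objY, frameW = 640, frameH = 480,
                         hFov = 60, vFov = 45) {
  // offset from frame center, as a fraction of half the frame
  const dx = (objX - frameW / 2) / (frameW / 2);
  const dy = (objY - frameH / 2) / (frameH / 2);
  return {
    panDeg: dx * (hFov / 2),   // positive = turn head right
    tiltDeg: -dy * (vFov / 2)  // positive = tilt head up (image y grows downward)
  };
}
```

Because the corrections are derived only from coordinates in a lane, the head logic stays oblivious to where those coordinates came from - exactly the decoupling Scott describes.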
A view from the front of the augmented reality controlled rover.
Interviewer: Right, so you’ve now simplified all the programming logic for basic intelligence.
Scott: Yep. And with that pattern, I’m now able to bring in all these extra SWIM Services, whether they exist in the cloud or wherever, and add functionality to the Rover without having to build everything on the Rover itself - which is unrealistic for CPU or GPU intensive features.
Interviewer: What are some of those features?
Scott: Along with the visual services I described before, one of the more interesting Services I was thinking of adding is an Alexa-style voice command, which would also require some back-and-forth between SWIM Services. Another Service that would be cool is pathfinding, which would start bringing some real autonomy to the Rover…