Sebastian Scherer from CMU's AirLab gives us a behind-the-scenes demo at ICRA of their autonomous flight control AI. Their approach aims to cooperate with human pilots and act the way they would.
The team took this approach to create a more natural, less intrusive way for human and AI pilots to share a single airport. They describe it as a Turing Test: ideally, a human pilot would be unable to distinguish an AI from a person operating the plane.
Their communication system works in parallel with a six-camera hardware package based on the NVIDIA AGX dev kit, which measures the angular rate of objects moving across the video frames.
In this setting, high angular velocity means low risk, since the object is moving quickly across the camera's line of sight rather than toward it.

Low angular velocity indicates high risk: an object that stays fixed in the image may be flying directly at the plane, headed for a collision.
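To make the heuristic concrete, here is a minimal sketch (our illustration, not the team's code) of how tracked pixel motion might be turned into a collision-risk flag; the threshold and function names are hypothetical.

```python
import math

def angular_rate_deg_per_s(track, focal_px, dt):
    """Approximate angular rate (deg/s) of a tracked object from its
    pixel displacement between two frames, assuming a pinhole camera.

    track: ((x0, y0), (x1, y1)) pixel centers in consecutive frames
    focal_px: camera focal length in pixels
    dt: time between the two frames in seconds
    """
    (x0, y0), (x1, y1) = track
    pixel_motion = math.hypot(x1 - x0, y1 - y0)
    # Small-angle approximation: pixels / focal length ~ radians.
    return math.degrees(pixel_motion / focal_px) / dt

def collision_risk(track, focal_px, dt, threshold_deg_per_s=0.5):
    """Low angular rate => object stays fixed in the image => high risk."""
    return angular_rate_deg_per_s(track, focal_px, dt) < threshold_deg_per_s

# Example: an object that barely moves across frames is flagged as risky.
print(collision_risk(((640, 360), (641, 360)), focal_px=1400, dt=0.1))  # True
```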
Sebastian Scherer: [00:00:00] So what do you see here? We have the pilot, which is Ian, and then this is essentially the AI. We'll set them up so that we have a situation where the human and the AI are going to converge at an airport. You'll see what the AI is doing here and a view of the station, and then we just restart the whole setup.

Ian has a headset on that he can talk into for the voice recognition.

OK, now it's reset. You can see here the human and the AI converging. This is what the algorithm is thinking: what the [00:01:00] human is going to be doing, and the decisions the AI is exploring. It's trying to find the best one. Can you do an announcement?
Pilot: Butler traffic, Skyhawk five seven three, three miles from the south, landing runway eight.
Sebastian Scherer: So he said he's going to runway eight. Once the system has understood this, it infers what Ian might be doing, which is landing on runway eight. Then the AI is going to make its traffic call at some point, once it gets close enough.
Just to clarify, this is an [00:02:00] uncontrolled airport. The way it works is pilots basically just announce where they are and what they want to do. There's no tower or ATC, it's basically…
AI: Butler traffic, Skyhawk seven three seven, three miles south, 2,700 feet, inbound runway two six, low approach, Butler.
Sebastian Scherer: So the AI said, OK, I'm going to come into runway two six for a low approach, and Ian's going to do another call.
Pilot: Butler traffic, Skyhawk five seven three, two miles to the south, landing runway two six.
Sebastian Scherer: And so once it has passed, it should switch to the other runway. So we should see the flight change around. Now this is more of a standard traffic pattern, and both of them are landing at two six. Here the AI is kind of cutting in front of him, because it thinks it's going to be [00:03:00] faster than the human pilot.

Yeah, this is fine, as long as the safety is right. Their behavior is based on a lot of data that we collected at the airports to figure out how people actually fly at these airports. What we want to achieve is that the AI flies the same way that humans fly.
AI: Butler traffic, Skyhawk seven three seven, turning down two six, Butler.
Sebastian Scherer: So the idea is that you want to have this seamless operation, instead of having a UAV come in that completely ignores everyone, like, OK, I'm going to take my right of way and not care about how anybody else flies.
Abate: What type of sensors are they using to communicate with each other?
Sebastian Scherer: So both aircraft, essentially, in a real system would have radios.

They also both have, in this case, ADS-B, so they can see each other. But [00:04:00] as a legal requirement, they would have to have this visual detect-and-avoid system to know: OK, I see this aircraft in front of me, I can separate from it.
Crowd Speaker: So in this case, does the AI do the visual landing, or how does it work?
Sebastian Scherer: So in this case we're only doing a low approach. The landing was not a big focus of this, right? It just comes in and flies the approach, but doesn't do the actual landing.
So it doesn’t passive landing?
Sebastian Scherer: So the AI is essentially exploring various options: what would the human do, what would the AI do? They're basically playing this game with each other, trying to figure out what the best choice is that is safe, right? And the other thing, which we're not really showing here, is a formal guarantee that they will never collide with each other, which we're also developing. [00:05:00]
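As a rough illustration of that joint-exploration idea (our toy sketch, not the AirLab planner itself), one can think of it as scoring candidate maneuver pairs and rejecting unsafe ones; the maneuvers, predicted outcomes, and separation threshold below are all invented for the example.

```python
# Each (ai_maneuver, human_maneuver) pair maps to a predicted outcome:
# (ai_time_to_complete_s, min_separation_m). A real system would predict
# these from learned models of how humans actually fly.
OUTCOMES = {
    ("extend_downwind", "land_26"): (95, 900),
    ("land_26_first",   "land_26"): (60, 250),   # fastest, but cuts in front
    ("low_approach_26", "land_26"): (80, 1200),
}

MIN_SEPARATION_M = 500  # stand-in for a formal safety constraint

def best_joint_plan(outcomes):
    """Pick the fastest AI maneuver whose predicted separation is safe."""
    safe = {pair: (t, sep) for pair, (t, sep) in outcomes.items()
            if sep >= MIN_SEPARATION_M}
    return min(safe, key=lambda pair: safe[pair][0])

print(best_joint_plan(OUTCOMES))  # ('low_approach_26', 'land_26')
```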
Abate: Has this been tested in really busy areas, with multiple aircraft coming in around the same time?
Sebastian Scherer:So that’s something that we are working on. Yes. And then we have the data. So this particular airport is pretty busy. There’s sometimes up to 6 aircraft. So we have data from lots of encounters of various aircraft, but it’s not something you have like tested in a real set up.
Abate: Is that data you have to capture yourself, or is it something that's readily available from airports?
Sebastian Scherer:Uh, so it’s something we capture, I would say it’s not available at the resolution we need it online. Again… I don’t know… flight tracker websites. They don’t capture at the frequency and low altitude that we need it.
Abate: So you actually have to go out and capture that yourself.
Sebastian Scherer: We have two setups, one at Butler County Airport and one at Allegheny County Airport, capturing ADS-B, radio traffic, and also camera data. [00:06:00] And we have published a dataset paper called TrajAir that has all of these behaviors, so you can learn social navigation at these airports.
Abate: Is the industry moving towards capturing this kind of high-resolution data with human-operated planes?
Sebastian Scherer: Well, this morning we heard a lot of talk about, OK, they'll put everybody in their own lane. This is a little bit of the counterpoint: with our flying, we say, OK, what integrates with how operations are done today? So, not necessarily…

I think people have not discovered the problem yet.
Crowd Speaker: Have you tested this with multiple AIs and multiple humans?
Sebastian Scherer: For now, just [00:07:00] one-on-one, right? Yeah.
Jay Patrikar: The next step is multiple humans and a single AI.
Sebastian Scherer: Multiple humans, one AI, and then we'll do…
Crowd Speaker: …multiple AIs, multiple humans.
No, this actually makes sense. At the end of the day, it's about how comfortable the human is with flying with AIs.
Sebastian Scherer: Essentially, what we are after is, I think, the equivalent of a Turing test for pilots, right? Can a pilot not distinguish whether he's flying with an AI or with another human pilot?

So, anybody want to try it? [00:08:00]
Abate: So to implement this, since we're not capturing that data live right now, does this mean pilots need to change the hardware in their planes to capture that high-frequency data?
Sebastian Scherer: In a plane it's easy; in the plane you do get it at the frequencies needed. It's just the online websites, where the data is aggregated from everywhere. They just get it once every five seconds or something like that, and you want it every second. In an aircraft, you get it at full rate.
Abate: Got it. So the data's all there; it's just not being recorded.
Sebastian Scherer: Right, and then it gets aggregated into some website. So what's actually in the box there: there's an ADS-B receiver, [00:09:00] and right now it's running inference on one camera here. Inference runs at twenty frames per second, but some backup optimization pushes this down to six frames per second. You can see it playing a video here of what we're running. Each image is five megapixels.
And our end goal is to get to essentially 60 frames per second total, so 10 for each camera.
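As a quick back-of-the-envelope check on those numbers (ours, not from the demo), the target works out to roughly 300 megapixels of imagery per second across the rig:

```python
cameras = 6
target_fps_per_camera = 10
megapixels_per_image = 5

total_fps = cameras * target_fps_per_camera      # 60 frames per second overall
pixel_rate = total_fps * megapixels_per_image    # 300 megapixels per second
print(f"{total_fps} fps total, {pixel_rate} MP/s to process")
```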
Abate:And it’s all black and white?
Sebastian Scherer: Yeah, that essentially doubles your resolution, since a monochrome sensor skips the Bayer color filter. The cameras are actually color cameras right now, but yeah, the right way to do it is black and white.
Abate: Why six cameras instead [00:10:00] of fewer cameras that are maybe higher resolution or higher frame rate?
Sebastian Scherer: Those are the highest-resolution global-shutter cameras that we could get. You would need a 30-megapixel camera if you wanted to do it with one, and you want 220 degrees by 40 degrees of coverage. So it's 220 by 40 degrees.
Abate: Yeah, you need a very wide horizontal field of view, but not much vertical field of view.
Sebastian Scherer:And it’s, I need to see essentially far enough back because somebody can catch you from behind basically
Abate: And how important is frame rate in all of this, as opposed to resolution?
Sebastian Scherer: You want to be able to track. The [00:11:00] most important metric you have to take away is angular rate, and to get that you need to track over time.
Abate: Yeah. And why is angular rate so important?
Sebastian Scherer: If the angular rate is zero, you are basically on a collision course.

As in, something is coming directly at you: if it's not moving in your field of view, then you are on a collision course.

And the more accurately you can decide whether something is not moving relative to you, the sooner you can move to the side.
Abate: So for that, resolution is key.
Sebastian Scherer: Yeah, resolution… And you also want to be able to track across time. If your frame rate is too low, you can't really filter that well, and you might lose association of who is who.
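Why frame rate matters for association can be shown with a toy gating rule (our sketch, not the team's tracker): each detection is matched to the nearest prior track, and if objects move more pixels per frame than the gate allows, identities get confused.

```python
import math

def associate(tracks, detections, gate_px):
    """Greedy nearest-neighbor association of detections to tracks.
    tracks: {track_id: (x, y)}, detections: list of (x, y).
    Returns {track_id: detection_index or None}."""
    matches, used = {}, set()
    for tid, (tx, ty) in tracks.items():
        best, best_d = None, gate_px
        for i, (dx, dy) in enumerate(detections):
            d = math.hypot(dx - tx, dy - ty)
            if i not in used and d < best_d:
                best, best_d = i, d
        if best is not None:
            used.add(best)
        matches[tid] = best
    return matches

tracks = {"a": (100, 100), "b": (130, 100)}
# High frame rate: each object moves ~4 px per frame, matches stay clean.
print(associate(tracks, [(104, 100), (134, 100)], gate_px=15))  # {'a': 0, 'b': 1}
# Halve the frame rate: per-frame motion doubles past the gate, so 'a' is
# lost and 'b' steals the detection that actually belongs to 'a'.
print(associate(tracks, [(128, 100), (158, 100)], gate_px=15))  # {'a': None, 'b': 0}
```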
Abate: And do other objects, like birds or other things in the air, get picked up? [00:12:00]

Sebastian Scherer: Actually, we pick up birds and drones in the air, even though they are not in our dataset, which is kind of interesting. There is a different video where it picks up birds and drones pretty well.
Abate: What does it classify them as?
Sebastian Scherer: As airplanes.
Abate: How many test cases have you run this through so far?
Sebastian Scherer: So the training is done on… it's called the Amazon training set…
Abate: Is the hardware platform something that you developed yourselves, or…?
Sebastian Scherer: Yes, the cameras [00:13:00] and the Xavier are off the shelf, but putting it together is all us; it's a 3D-printing project.
Abate: Yeah. Is that a heat sink on top?
Sebastian Scherer: This is the AGX, the NVIDIA AGX. This is just the dev kit here.