A.I. teaches itself to drive in Trackmania

  • Published on Nov 12, 2020
  • A.I. teaches itself to drive in Trackmania using the NEAT algorithm, a particular type of genetic algorithm. The algorithm is used to select a neural network with optimal weights and an optimal structure.

    Thanks, Trabadia! His TheXvid channel: thexvid.com/user/Trabadia1

    More information about the NEAT algorithm:
    neat-python.readthedocs.io/en...

    Contact:
    Discord - yOsh_85
    Twitter - yoshtm1
  • Gaming

Comments • 4 036

  • Yosh
    Yosh  Year ago +1693

    Thanks for watching this video!
    This is my first time using the NEAT algorithm, so there is obviously still room for improvement. The main problem is that my AI doesn't have a memory of the map and can't anticipate "what comes next" with its current inputs. I have some ideas to improve my AI, so don't forget to subscribe if you want to see the next steps of this project ;)

    • Hans Nørløv
      Hans Nørløv Month ago

      I know this video is from a long time ago, but it just struck me upon finishing it: what would happen to your little AI cars if you "killed" any car that hit a boundary after generation 10? I guess it would mean that slower cars would initially pass on their "genes", but I think in the long run those careful drivers could become much faster than those bumping into walls. It might require you to let them learn the track more gradually for there to be any survivors when you increase the run time.

      Also, I saw another video of someone training lots of AI in a much simpler manner, but it seemed that all of his AI were trained simultaneously... Is that a limit of the game engine you use? Which leads me to consider: couldn't this training be done without any graphics whatsoever? Why not just have a program that keeps track of position, direction, speed and the walls, and then run it as fast as your CPU can run the cycles? Then, to check on performance, you load the AI cars into the game engine every 10th, 50th or 100th generation?

    • Best Electronic Music From New Geniuses
      Best Electronic Music From New Geniuses Month ago

      You have a wonderful narration voice in English. That's an advantage for editing. Well done.

    • André Rodrigues
      André Rodrigues Month ago

      You should prioritize individuals that don't hit the walls, since hitting them costs speed.

    • ZachElrodGaming
      ZachElrodGaming 2 months ago

      You know Trabadia was outed as a cheater, right?

    • koda twelve
      koda twelve 2 months ago

      Use a genetic algorithm on an LSTM and it will have a memory.

  • MisterFilOfficial
    MisterFilOfficial 11 months ago +4146

    This is exactly how water flows through pipes. Should we try to put a genetic algorithm on water drops to teach 'em to flow better? 🤔

    • Mustakrakish
      Mustakrakish 28 days ago

      I don’t think so. The carticles look quite compressed.

    • Peppermint Gal
      Peppermint Gal Month ago

      The resemblance is superficial, but yes, AI have been developed that can simulate physics.

    • HiUfux The VideographeR
      HiUfux The VideographeR Month ago

      Last minute Holiday Gift Ideas... thexvid.com/video/9WFHEmI-q5U/video.html

    • Jack Allen
      Jack Allen Month ago

      @guywiththebottle thank you!

    • guywiththebottle
      guywiththebottle Month ago

      This got 4k likes...

  • Mr Shadow
    Mr Shadow Month ago +320

    Of course Trabadia can help with developing a TrackMania AI. He's got lots of experience using tools in runs.

    • SCAR aw
      SCAR aw 10 days ago

      @Sephikong he would never be one of the best without cheating
      maybe oldest

    • GodSaveTheGame
      GodSaveTheGame 12 days ago

      @Whallop they are still chads tho

    • Sephikong
      Sephikong 13 days ago

      @Whallop rng or "Random Number Generation". Basically luck (but not really)

    • Whallop
      Whallop 13 days ago +1

      @GodSaveTheGame That was because of a new strategy

    • Whallop
      Whallop 13 days ago +1

      @Sephikong mg?

  • Scott Carr
    Scott Carr 7 months ago +649

    One of the big differences between the human driving and the AI is information available. The human learns the layout of the track and optimizes each turn for the next. The AI is only given information about what it can see at any given moment. In other words, the AI is effectively driving the track for the first time every time.

    • Iska Mag
      Iska Mag 18 hours ago

      I think that's a limitation of OP's generic learning program. He could have made a simulated replica of the game, considerably enhancing and speeding up the learning, but that would take a lot of effort, especially since the program we're trying to simulate is proprietary.

    • Samuel Ashcroft
      Samuel Ashcroft 6 days ago

      The AI brain might be too small for this, but the AI does kinda learn parts of the track, it's an example of overfitting. It isn't encountering it for the first time every time, necessarily.

    • newellgster
      newellgster 11 days ago

      I more or less agree with Scott here. The "problem", in my view, is simply that your fitness score algorithm is weak and the inputs are too restrictive (or at least what you indicated the inputs to the vehicles were). Some knowledge of the nature of the upcoming track, in terms of the degrees and directions of turns, would provide the AI with the same sort of information that a human driver would have. As was pointed out, you actually selected for overall time and effectively ended up in a local minimum rather than a global minimum, because it did not consider enough factors when determining fitness. In short, more info for the AI cars (e.g. where they are on the track, what turns, if any, are coming up, and what types of curves they are) would have allowed for an increased number of strategies. The simple fact that even the best of the AI cars were still hitting walls shows that both the fitness algorithm and the data supplied were inadequate, IMHO.

    • Some Nazgul
      Some Nazgul 16 days ago

      I have no clue in this field, so please excuse me if I sound stupid, but just out of curiosity: would it make a difference if you fed the AI its own runs so it could remember the track and what worked best?

    • Yulenka
      Yulenka 28 days ago

      It's obvious, you just have to train the AI on screencaps. This way the AI will have exactly the same information as a human.

  • The Coding Train
    The Coding Train 7 months ago +887

    Watching this video and zoinks there I am! Amazing work!!

    • Darth Maul
      Darth Maul 2 months ago

      @Yosh This is a lame video. AI has been able to do this since the 1970s.

    • ventrys
      ventrys 4 months ago +4

      @Pro Logic ok

    • Pro Logic
      Pro Logic 4 months ago +4

      @Brad th

    • Brad
      Brad 5 months ago +5

      @Yosh the

    • Yosh
      Yosh  7 months ago +75

      Thank you so much!! I loved your video series on genetic algorithms, it helped me a lot in the beginning! Very happy you came across this video :D

  • Khalid Al-Mohammed
    Khalid Al-Mohammed 3 months ago +251

    Try this: if the car hits the wall, then remove a point. This should make the AI learn faster because hitting the walls will let them learn to not hit walls.

    • ᚛TђeͥDoͣcͫτor᚜
      ᚛TђeͥDoͣcͫτor᚜ 14 days ago

      @Designator not accelerating applies to the start of the game, this is to encourage the car to move.

      Maybe I didn't use the right words then

    • Designator
      Designator 14 days ago +1

      @᚛TђeͥDoͣcͫτor᚜ Your suggested solutions are way too prescriptive. We want the AI to find the fastest route through a track. If you look at actual TrackMania records you will find that _not accelerating, braking,_ and _hitting walls_ are sometimes part of the fastest strategy. If you penalize these behaviors you will end up with cars that are nice to look at while driving, but probably won't beat any records.

    • ᚛TђeͥDoͣcͫτor᚜
      ᚛TђeͥDoͣcͫτor᚜ 20 days ago

      @shadow TOC another condition is also set above for acceleration

      If it accelerates, there is a +1 increment in points which serves as an incentive to move

      No movement leads to punishment

    • ᚛TђeͥDoͣcͫτor᚜
      ᚛TђeͥDoͣcͫτor᚜ 20 days ago

      @shadow TOC as I've stated above already.

      They are time-limited, so they are forced to find the fastest possible way to get to the finish line without hitting a wall.

      This will automatically prevent it from driving slow, spinning around, not driving at all or driving very cautiously

    • shadow TOC
      shadow TOC 20 days ago +1

      But what if they decide to just drive slow and cautious, or not drive at all?

  • David Thomas
    David Thomas Year ago +2477

    If you were to introduce a stronger penalty for hitting the wall, such as ending the run right there and not letting it progress, would a stronger rule like that ensure the 'gene' for clipping the walls was removed?

    • Shawn Elliott
      Shawn Elliott Month ago +1

      That rule would be inaccurate. There is no inherent penalty for hitting a wall in racing. You don't even lose points for going completely off the track, unless doing so allows you to pass someone who stayed on the track. The ONLY penalties that really exist are vehicle damage (reduced performance and cost of repairs) and slower lap times.

    • CaedenV
      CaedenV 2 months ago

      I suspect the issue is the rays being used to measure wall distance. Once you get past the first tight turn, it is like the AI cannot see where the wall ends, and you end up with blank spaces along the most efficient routes. Here he is essentially maximizing for the furthest distance measurement seen, but may need a higher resolution of vision for it to perform well.
      ... and I realize I am making a ton of assumptions and could be totally off base here, but that appears to be the emergent pattern.

    • Manz
      Manz 2 months ago

      @Tomas H Computers won't ever match humans without the data provided. At least not in a timely manner.

      We see that they are vulnerable to simple mistakes one could call childish. This is why self-driving will always be a pipe dream. Assisted driving? No problem. A completely autonomous car sounds frustrating as hell and, even more importantly, dangerous.

    • Hui Rong
      Hui Rong 2 months ago

      @cinegraphics Agreed, this is similar to how speed-runners use game physics/glitches to their advantage to achieve new time records.

    • Seth Math
      Seth Math 3 months ago

      I feel that coming to a stop or reversing should end their life but tapping a wall? Nah the ai could be on to something as long as it doesn't slow down too much

  • Enkrod
    Enkrod Month ago +57

    Trabadia is very well suited to assist a tool, as he has received so much help from a tool assistant himself. :p

    • Dil Drawz
      Dil Drawz Month ago +4

      Was looking for a comment on Trabadia lol

  • Emerson Baur-Swofford
    Emerson Baur-Swofford 8 months ago +458

    Left, Right and Floor it. The only three inputs a true racing car ever needs.

    • George B
      George B 26 days ago

      @Manz Handbrake in a fwd car is kinda faster sometimes, if it is done right, or slip angle. But it needs to be done carefully.

    • George B
      George B 26 days ago

      You need to teach the AI to brake. If not, it is going to fly off at some turns.

    • ke6gwf - Ben Blackburn
      ke6gwf - Ben Blackburn Month ago +4

      You can get rid of one of those inputs in Nascar. Makes it simpler... Lol

    • Reuben Betts
      Reuben Betts Month ago +3

      @Manz have you played trackmania before?

    • Manz
      Manz 2 months ago +1

      @KappaKappaKappaKappa it wouldn't be as efficient so it'd probably cut all the nonsense out

  • John Smith
    John Smith 7 months ago +21

    I feel like using shortest split times from one checkpoint to the next might give better results. Maybe there's a car that doesn't do sections 1 and 2 very well, but kills it on section 3 where every other "well performing" car is having lots of trouble. If your fitness function is only time to end point, you're going to miss out on some more targeted improvements.
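
    A minimal sketch of this split-time idea, assuming each car reports one time per track section. The car names, the numbers and the `select_parents` helper are hypothetical and are not taken from the video's code:

```python
from typing import Dict, List, Set


def best_per_section(split_times: Dict[str, List[float]]) -> Dict[int, str]:
    """For each track section, return the id of the car with the shortest split."""
    n_sections = len(next(iter(split_times.values())))
    winners = {}
    for section in range(n_sections):
        winners[section] = min(split_times, key=lambda car: split_times[car][section])
    return winners


def select_parents(split_times: Dict[str, List[float]]) -> Set[str]:
    """Keep the overall fastest car plus the winner of every individual section."""
    overall = min(split_times, key=lambda car: sum(split_times[car]))
    return {overall} | set(best_per_section(split_times).values())


# Hypothetical split times in seconds for three cars over three sections.
splits = {
    "car_a": [10.0, 12.0, 20.0],  # fastest overall, slow on section 3
    "car_b": [10.5, 12.5, 20.5],
    "car_c": [14.0, 15.0, 15.0],  # slowest overall, but best on section 3
}
parents = select_parents(splits)
print(sorted(parents))
```

    With a fitness based only on total time, `car_c` would be discarded even though it is the only car that handles the last section well; selecting per section keeps it in the gene pool.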

  • Jake B
    Jake B 7 months ago +98

    13:33 this just makes me think about how long this must have taken to render

  • 21thCenturyFrog
    21thCenturyFrog 11 months ago +1957

    Trackmania in this form is quite a sophisticated liquid simulation.

    • Legendized
      Legendized Month ago +2

      Fluid simulation*

    • StronglyImplied
      StronglyImplied 8 months ago +1

      quack

    • Silas Willingham
      Silas Willingham 9 months ago +1

      @Verrtex dat part wide boi

    • Lizzie
      Lizzie 9 months ago +8

      Totally not kidding. I bet if you layered more high-level runs at increasingly delayed start times it would appear even more accurate.

    • Akari Insko
      Akari Insko 11 months ago +18

      @Verrtex thats trippy

  • pyramidbuilder
    pyramidbuilder 2 months ago +11

    Hey Josh! Your AI driving algorithm reminds me of water flowing down a tube... which is almost the opposite of how F1 drivers drive, as they hug the corners rather than rebounding off the opposite walls. Really interesting experiment! Thanks for sharing.

  • Mike Hoyer
    Mike Hoyer Month ago +3

    This is awesome. It looks like the AI learned a very good "reactive control" method. I would think that to learn more of a "feed-forward" control system, aka "learning the track", a higher-resolution input to the net would be needed, along with more neural layers. The fact that the AI never learned to hug the wall is very telling. It's staying away from the wall to avoid bumping into it, aka reactive control. You made a very safe, but not so fast, algorithm. This is amazing work! I look forward to your future videos.

  • Stig
    Stig 5 months ago +44

    My only issue with generations (with no experience, just from watching YouTube) is that you see a good contender that isn't the fastest (e.g. doesn't hit a wall but comes second) and it gets scrubbed, when in a generation or two it might actually take over from the current wall-smashing leader.

    • Klobi for President
      Klobi for President 19 days ago

      @D. K.
      One assumes not as that's basically never done.

    • D. K.
      D. K. 27 days ago

      @Matthew Smith Oh, is he culling the worst 99%?

    • Go Super Sheep
      Go Super Sheep Month ago +1

      The term for what you're describing is a local minimum/maximum: the algorithm effectively gets "stuck" within the search space. It's why, as the other commenter says, you don't kill all but the winner, but it's also the reason random mutations are introduced!

    • Matthew Smith
      Matthew Smith 3 months ago +23

      That's why you usually only cull the worst 50% instead of the 99% that didn't win.
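
      A rough sketch of that "keep the top half" rule, written as a plain genetic algorithm over fixed-length weight lists. This is an illustrative assumption, not NEAT itself (which also mutates network structure) and not the video's code; `next_generation` and its parameters are hypothetical names:

```python
import random


def next_generation(population, fitness, keep_fraction=0.5, mutation_scale=0.1, rng=None):
    """Keep the best `keep_fraction` of genomes and refill with mutated copies of survivors."""
    rng = rng or random.Random()
    ranked = sorted(population, key=fitness, reverse=True)  # best genome first
    n_keep = max(1, int(len(ranked) * keep_fraction))
    survivors = ranked[:n_keep]
    children = []
    while len(survivors) + len(children) < len(population):
        parent = rng.choice(survivors)
        # Gaussian noise on every weight plays the role of random mutation.
        children.append([w + rng.gauss(0.0, mutation_scale) for w in parent])
    return survivors + children


# Toy fitness: genomes closer to the origin score higher.
def toy_fitness(genome):
    return -sum(w * w for w in genome)


rng = random.Random(0)
population = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(10)]
new_population = next_generation(population, toy_fitness, rng=rng)
print(len(new_population))
```

      Keeping half the population means a second-place car that never touches a wall still gets to reproduce, which is the point made in this thread.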

  • Toma
    Toma Month ago +1

    Genetic learning has always been cool to me. It gives the AI a sense of life and feels like teaching your kid something new with them steadily getting better

  • Bix G
    Bix G Year ago +4695

    Can't wait for the implementation of the brakes in order to see the AI drift !

    • Isaacs Random Videos
      Isaacs Random Videos 22 days ago

      Cool! Amazing! Brilliant! Please help me!

    • Селим Каюмов
      Селим Каюмов 26 days ago +1

      AI86 dorifuto

    • George B
      George B 26 days ago

      @GOLF BOY Have you ever heard of weight transfer or left foot braking.

    • Manz
      Manz 2 months ago

      @MasnySunshine that's not drifting that's powersliding.

    • Chris Manuel
      Chris Manuel 6 months ago +3

      I love that there's a million people arguing real life physics against a video game. Nobody tell them about Mario Kart and charging your drift (which you brake to perform) for an acceleration boost

  • Nathan Mennel
    Nathan Mennel 6 months ago +5

    The way that first car morphs at 13:33 is amazing. This was extremely visually interesting

  • John C
    John C 7 months ago +965

    The French is strong with this one.

  • Jakob Leck
    Jakob Leck Month ago

    Great video! Suggestion: The decision making input that is optimized is trained against a global constraint (distance travelled or time to checkpoint), whereas the effect of each steering input is only over the next few seconds. Maybe evolution might run faster or yield better results if you train the decision process with a fitness function that depends on those short-time effects, e.g. the average velocity over a certain short time interval?

  • Mike K
    Mike K 7 months ago +1

    BRAKES!!!! awesome video and ability to do the programming interactions, but you can't compare people that have brakes to an AI that doesn't. You can see the AI strategically hitting walls instead of braking. I would love to see a video of brakes added and have a running of 1000 generations on a variety of courses to see where it can go. It would be also interesting at that point to see if you can merge different styles like a "marrying and a child" kind of thing.

  • Kilian Klein
    Kilian Klein 5 months ago +6

    What if the fitness function included the shortest distance driven to reach the finish line as well as the time? (with a timeout for those that never make it anyways of course) Wouldn't that make the AI learn that straight lines are usually faster?

    • Lex Mitchell
      Lex Mitchell 4 months ago +1

      @Kilian Klein Although that might optimise for better behaviour on straights sooner, it is quite likely that it would be a negative overall. A typical racing line is about the average speed a car can hold around a turn as well as the distance. Optimising for distance could hurt corners; optimising for time should eventually give the best model.

  • Victor Henrique
    Victor Henrique 4 months ago +2

    Great video! Currently your AI has a reactive approach to each frame of information it receives. Maybe you can give the AI the current frame along with 3 or 4 past frames so it can generalize the concepts of speed and direction.

    • koda twelve
      koda twelve 2 months ago

      Because it uses a plain NN instead of an LSTM.
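
      A small sketch of the "current frame plus a few past frames" idea, assuming each frame is a short list of sensor readings. The `FrameStack` class is a hypothetical helper, not part of the video's code:

```python
from collections import deque


class FrameStack:
    """Keeps the last `n_frames` sensor frames and exposes them as one flat input vector."""

    def __init__(self, n_frames, frame_size):
        # Start with all-zero frames so the input size is constant from the first step.
        self.frames = deque(
            [[0.0] * frame_size for _ in range(n_frames)], maxlen=n_frames
        )

    def push(self, frame):
        """Add the newest frame (dropping the oldest) and return the stacked input."""
        self.frames.append(list(frame))
        # Oldest frame first, newest frame last.
        return [value for f in self.frames for value in f]


stack = FrameStack(n_frames=3, frame_size=2)
first = stack.push([1.0, 2.0])
stack.push([3.0, 4.0])
stack.push([5.0, 6.0])
latest = stack.push([7.0, 8.0])
print(first, latest)
```

      A feed-forward network fed with the stacked vector can compare consecutive readings, which gives it a crude sense of how fast the walls are approaching without any recurrent memory.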

  • SpaceTechMadness
    SpaceTechMadness 5 months ago +3

    What if you mixed supervised learning with evolutionary algorithms, so that it can take the data you give it and improve upon that? And also put the AI on a different track with every new generation.

  • Sean Bychowski
    Sean Bychowski Month ago +1

    Hi, I'm very interested in going into a career involving stuff like this. I was wondering where you learned how to do all this.

  • GekoPoiss
    GekoPoiss Day ago

    It would be interesting to learn more about how you interface with the game programmatically and run simulations in batches.

  • Brandon Fortner
    Brandon Fortner Month ago

    I love how you explained all the science behind it, it allows me to understand how these AI work on a conceptual level

  • Small Ejector
    Small Ejector 2 months ago

    The best AI explanation I have seen, makes it very understandable, thank you.

  • Silas Stryder
    Silas Stryder 5 months ago

    That's crazy how the AI systems differ, one can improve seemingly infinitely through diminishing returns until it's just a Tool-assisted speed run that literally can't be better, the other copies the best example we can provide and, in theory, would reach that example and no longer improve if given enough generations.

    One is far slower however, so if you had, say a speed runner to collect data from it'd be interesting to see how many generations it takes the genetic algorithm to overtake the speedrunner supervised AI.

    This will become more publicly understood as AI is implemented in more aspects of everyday life. Say, a genetic algorithm for a Google-scale AI that takes 5 years to "grow" into a retail-quality state and improves a bit more each year compared to those new robo-dogs imitating K9 units and take a few months to learn how to ascertain a threat and pin them or disarm them by following veteran K9s collected data and supervised learning.

  • Der Muschelschlürfer
    Der Muschelschlürfer 2 months ago +2

    Try giving it the goal of the lowest track time with reinforcement learning, would be interesting to see

  • GameGlitcher
    GameGlitcher 7 months ago

    I think what's missing for a truly masterful AI is more sensory information, such as knowledge of the shape of the track.
    I believe there is likely a smooth surface function that could be created, and then a point object attempting to find the shortest path to the ending. If the AI happens to find a shortcut from one of its paths, it adds a new surface to the 'map' and essentially creates an alternate route. This would not rely on the input data from the front probes, because the AI would have access to the function to determine the best places to look at to determine its route.

    I have not dabbled in AI before but would be happy to try and help with this approach. Let's just say I am an over-qualified YouTube commenter lol.

  • Reiners Eulenspiegel
    Reiners Eulenspiegel 8 months ago

    Very cool approach, nice work.
    When I look at your later generations, it looks like the wiggling (driving in a wave pattern left and right) from the "drive as many meters as you can" goal is still a major behaviour that won't be corrected so quickly. Or is it the control that acts too slowly or too harshly? (AI "thinks": Oh, I'm too close to the right, let's steer left. Oh, I'm too close to the left... and so forth.) Maybe it could help to make an extremely minimalistic straight track with a few checkpoints for the first generations, so the algorithm gets that just accelerating without much steering is an option to consider from time to time (with checkpoint/finish times as the goal to compare).

  • xenomancer
    xenomancer 3 months ago

    The fitness functions you chose were very limited. Try combining track distance and checkpoint time as a simple modification next time. Once cars are finishing races you can use the mean distance traveled by swarm members completing the race as a benchmark for the fitness, the same for the time to complete the race. A completed race means all checkpoints were met, so using all checkpoint times becomes less important.

  • Everything Outdoors
    Everything Outdoors Year ago +2104

    Isn't it strange how this looks just like flowing water.

    • evolve
      evolve 2 months ago

      @Ilan Lee Yes least effort after many failures.

    • evolve
      evolve 2 months ago

      @Javier Segura That looks like flowing "digital" water.. As an example.

    • evolve
      evolve 2 months ago

      You can learn from water in many deep ways.. Yepp

    • David Harvey
      David Harvey 9 months ago

      Yes. That's why you don't leave the tap dripping, can't have too many winners.

    • M. A. Packer
      M. A. Packer 10 months ago

      Makes sense since water finds its own level. AI eventually figures out the limits of its programming

  • maxhouseman
    maxhouseman 3 months ago

    Very interesting! For me the movement of all cars looks like a water stream moving through a pipe.

  • MalevolentDivinity
    MalevolentDivinity 7 months ago

    One thing worth considering, m'thinks, would be having different fitness functions to encourage more divergent strategies.
    Like, in this case, you could have the best time to the first checkpoint, the best time between the first and second, the best time between the second and third, the best time between the third and fourth, and then the best time for the entire course. Best five in all categories, each have four descendants in the next generation.

    Might also be worth considering making deductions whenever it hits the wall and giving it bonus points for the lowest distance traveled after finishing the race.

  • Sean Garratt
    Sean Garratt Month ago

    Very cool! Perhaps add some track pre-analysis to plan the most efficient line through the course and have the AI try to ride the line at the highest speed. I don't think this would be cheating. A human driver would build this knowledge after a few runs through the course. How to build that line would itself be an AI challenge, taking into account pre- and post-turn lateral momentum and taking maximum advantage of straightaways.

  • Freak
    Freak 7 months ago

    Interesting video, the AI learning is quite effective. The problem I see is that the information available to the AI is just not enough to calculate the best curve trajectory. It's like racing at top speed with a view distance of 5 meters. Using a more complex raycasting setup for the curves might solve the problem, for example: triple the cast rays and create another group with a forward offset of around 2 to 3 car lengths each.

  • Reflex UK
    Reflex UK 5 months ago

    May sound silly, but can you program it to know there's an end location to reach and that this is the sort of path it has to stay in?

    That would then give the AI a general idea of which direction it needs to drive, and then over time it would learn the mechanics better for faster lap times?

    • J F
      J F 5 months ago

      I don't think the AI could place itself precisely on that map. It's like giving Google Maps to a baby.

  • Ferociousfeind
    Ferociousfeind 5 months ago

    Based on that best-time AI run, it seems the first thing the AI learned was that turning too much is more costly than turning too little and bumping into walls. Interesting.

  • yd5
    yd5 7 months ago

    This is really interesting stuff. Also pretty trippy to watch at the end. Thanks for posting this!

  • Rollin' Dutchy
    Rollin' Dutchy 3 months ago

    Good video, very clear. Couldn't you add an input for hitting a wall/object? Then the mutations would try to stay away from walls while driving the fastest route.

  • Heybrine
    Heybrine Year ago +1272

    13:13
    Everyone: making progress

    That one car going backwards: heheh, Imma be different

    • Hui Rong
      Hui Rong 2 months ago

      It's the one guy who thinks that going back to the start point allows you to reach the end quicker. Unfortunately, the start and end points do not meet ;)

    • Myname'sPedro_L
      Myname'sPedro_L 9 months ago

      @anonymous all of humanity is the backwards car

    • Daniel
      Daniel 9 months ago

      You say you want a revolution
      Well, you know
      We all want to change the world
      You tell me that it's evolution
      Well, you know
      We all want to change the world

    • Joe
      Joe 10 months ago

      that one guy in a forza lobby:

    • AA
      AA 10 months ago

      @Dus-DB Cool, I don't play, I just recently followed "wirtual" and I am amazed at how dedicated the people of this game are and how they find the weirdest shortcuts. An AI would need completely different grading, I would assume, as a shortcut could be slower initially.

  • David C
    David C 7 months ago

    Another important component is the addition of more senses with negative performance values, such as a way to detect and penalize for collisions or driving the wrong way.

  • MileHigh Twin
    MileHigh Twin 27 days ago

    Keep at it man!!! You can do some impressive stuff with AI if you get it down. You got yourself a new subscriber

  • Leo Nigro
    Leo Nigro 5 months ago

    Maybe have an optimal driving line as a parameter(feature?) and give them the ability to brake. Ironically, in these types of games, bouncing off walls might actually be faster than braking.

  • Aaron Murgatroyd
    Aaron Murgatroyd Month ago

    I think the AI needs more information about the track ahead, the data you are feeding it only tells it what is immediately in front of it, when a human drives they remember what is coming up and can adjust their cornering curve for the corner after the current one. Start feeding the AI two sets of data, one for close walls, one for walls further away, the AI can then have more immediate information rather than information that is queued on timeframes.

  • Hui Rong
    Hui Rong 2 months ago

    I am thinking that expanding on the 'memory' idea of the map for genetic algorithm would help improve its performance.
    Some ideas off the bat:
    1. Simplify map input into a chain of data point set (e.g. position, angle of turn to the left/right etc), similar to how notes are recorded on a score sheet
    2. Have real-time positional input data for the neural network to know its progress on the track
    With this information, the genetic algorithm might learn how to make predictions and optimize moves to handle upcoming portions of the map, based on its current and future data point sets.

  • Jtryy4 Stryturu
    Jtryy4 Stryturu 23 days ago

    Would be interested in seeing a video about the coding and model building/selection in this project

  • Florin Balanescu
    Florin Balanescu 8 months ago

    From my experience, Q-learning (or DQNN, if you're also using it in conjunction with NNs) is a much harder problem to solve/converge. For the generational approach to RL, have you thought of having some other information picked up well in front of the current car, like the next turn/curve or the next 2 turns as 2 additional input floats, or even the distance to them (+2 floats)? That would alleviate the lack of track memory. It is clear that a human who masters the track would first have to learn it, so in order to match that with an AI, we would probably want to help it remember the next turn or couple of turns in advance. Waiting for your next videos.

  • dascandy
    dascandy Month ago

    Have you considered hooking up the network twice? Instead of taking the raw output use the network both for the actual track and for the mirror-imaged track (with inverted inputs), and sum the two outputs. Driving the mirrored track should be identical to this (handling wise) so that would get two training sessions out of a single run. Also, you likely can use either a smaller network or get more depth with training.
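
    A minimal sketch of this mirroring idea, under two assumptions that the comment does not state: the inputs are wall-distance rays ordered left to right, and the network outputs a single signed steering value (negative for left, positive for right). With a signed output, the mirrored result has to be negated before it is combined, so the sketch averages a difference rather than a plain sum. `toy_policy` is a hypothetical stand-in for the evolved network:

```python
def mirror(rays):
    """Mirror the scene left-to-right by reversing the order of the rays."""
    return rays[::-1]


def symmetric_steering(policy, rays):
    """Combine the policy's answer on the real scene and on the mirrored scene.

    The mirrored answer is negated because steering left in the mirror image
    corresponds to steering right on the real track.
    """
    return 0.5 * (policy(rays) - policy(mirror(rays)))


def toy_policy(rays):
    """Hypothetical, deliberately lopsided stand-in for a trained network."""
    weights = [-1.0, -0.5, 0.1, 0.4, 0.8]
    return sum(w * r for w, r in zip(weights, rays))


rays = [1.0, 2.0, 3.0, 4.0, 5.0]
left_turn = symmetric_steering(toy_policy, rays)
right_turn = symmetric_steering(toy_policy, mirror(rays))
straight = symmetric_steering(toy_policy, [1.0, 2.0, 3.0, 2.0, 1.0])
print(left_turn, right_turn, straight)
```

    Whatever the underlying weights are, the combined output treats a left turn and the matching right turn identically and steers straight ahead on a perfectly symmetric view.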

  • IAMDIMITRI
    IAMDIMITRI Year ago +196

    So the cars don't seem to be able to predict a turn. If you want them to be able to predict a turn, you need to either increase the input resolution, so that the AI is able to see an oncoming turn, or increase the neural resolution, both to allow cars to process different turn radii and to add temporal resolution so that cars can hold and remember certain turns.
    Small and wide turns look similar to the AI, and it needs a way of distinguishing between them.
    That's the difference between a simple stimulus-response AI and one that can better generalize the problem.

    • Inseerlink
      Inseerlink Year ago

      Exactly - "supervised learning" being "better" than open AI learning just means the AI is plainly badly designed.

    • Mavairick
      Mavairick Year ago +1

      Well, most self-learning AI depends only on "reaction" to the instantaneous situation and uses no "memory", which is why it's very good at generalising a solution to similar situations. The thing is, Trabadia or any human running the map would do several runs and, as you said, remember the order of the turns, and then simply anticipate their position for the next turn.

      That's the point where learning AI and humans meet up. You first discover the game, then once you know how to play, you discover how to run the specific track. I've seen some players do blind runs, where it's no longer about knowing how to play but pure memory of inputs XD

    • Blyat Bomber
      Blyat Bomber Year ago

      @Tom Mosher If the goal is highscore then you need to approach this problem from a different perspective. Neural networks with sensory input are NOT the way to go here. Neither is overfitting.

    • Tom Mosher
      Tom Mosher Year ago

      Real-life race drivers take time to learn each track. They're not perfect on the first run either. So some amount of overfitting seems appropriate if the goal is quick times, rather than making a general driving-bot.

    • Yosh
      Yosh  Year ago +11

      @Harmen Oosterhof I already trained a new genetic algorithm with a fitness function taking wall hits into account, you will see the result in my next video ;)

  • Kail Labs
    Kail Labs 3 months ago

    It seems to me that the reason your AI isn't improving beyond that point is that the penalties for making minor mistakes are not harsh enough. I.e., a simple time-based selection isn't enough; you should add additional penalties for doing things that increase time or reduce speed, making the environment artificially harsher. For example, add time to the run every time a wall is hit, to reinforce more strongly that hitting walls is bad.

  • Casey Howden
    Casey Howden Month ago

    When you add extra time you see the brilliance of the AI: it applies what it had learned from the previous part

  • Gabe Keeter
    Gabe Keeter 2 months ago

    Have you tried using the PPO algorithm? It's well known for its use in classical controls and as such might be useful here

  • Lee Myers
    Lee Myers 7 months ago

    I think it would improve more if you gave it a new track every generation. It would be more complicated to program but the changes would force it to be more generalized.

  • Lzfix
    Lzfix Year ago +111

    Damn every episode is better than the other, this project is just too cool man, keep it up! :D

  • Michael Hay
    Michael Hay 25 days ago +1

    What if you changed the goal of the network to favor highest average speed? Based on the video alone it looks like the AI was learning that hitting certain walls was beneficial. They kept bumping into the same couple of walls towards the end of the track in order to turn, by the looks of it, but that lowers their time.

  • Ron Hopping
    Ron Hopping 2 months ago

    Could include collisions as a condition of the learning. That would likely bring the generation number down.

  • sycips
    sycips 18 days ago

    Impressive that you created this for Trackmania! Two major things you could have added are penalizing bumps, which slow down the car, and making the AI able to brake, which is necessary on some tracks. Even though this isn't added, it was fun to see what your results were. Great job!

  • kopa shamsu
    kopa shamsu Month ago

    You should have split the objective function into two parts: maximize the distance traveled d(x) and minimize the number of hits on the curb h(x). Then the objective function becomes something like f(x) = 1/d(x) + h(x), and you minimize f(x). Then you would have seen better convergence. You can also add a weight for each criterion, like f(x) = (w1 * 1/d(x)) + (w2 * h(x)). Even better if you could normalize d(x) and h(x) onto the same range [0.0, 1.0].
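
    A minimal sketch of the weighted, normalized objective described above. The names `max_distance` and `max_hits` and the default weights are assumptions for illustration. It also swaps 1/d(x) for the bounded term (1 - d_norm), since 1/d(x) blows up near zero distance and does not fall in [0, 1].

```python
def objective(distance: float, hits: int,
              max_distance: float, max_hits: int,
              w1: float = 0.5, w2: float = 0.5) -> float:
    """Lower is better: weighted sum of distance shortfall and wall hits."""
    # Normalize both terms to [0, 1] so the weights are comparable.
    d_norm = min(max(distance / max_distance, 0.0), 1.0)
    h_norm = min(max(hits / max_hits, 0.0), 1.0) if max_hits > 0 else 0.0
    # (1 - d_norm) is 0 for a full-distance run and 1 for no progress.
    return w1 * (1.0 - d_norm) + w2 * h_norm

print(objective(distance=50.0, hits=5, max_distance=100.0, max_hits=10))
```

    A genetic algorithm that maximizes fitness could use the negative of this value as the fitness.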

  • RegamusMaximus
    RegamusMaximus Year ago +944

    I feel like not using wall hits as an elimination category was an oops moment

    • Kostas Papadopoulos
      Kostas Papadopoulos 7 months ago

      I think that the whole idea "hitting the wall could be beneficial " is not applicable here. Trackmania is known for the use of such tactics, but humans have more information about the environment. In this case, Ai had information only about the map in front of it, only some distances to be exact. Thus, it would be impossible to develop any behavior in which the car could accurately predict a favorable collision. Even if it did, it would be overfitting. Because it would require knowledge of the rest of the map. Hence, maybe adding a negative score for each collision would enhance the learning.

    • PatalJunior
      PatalJunior 7 months ago

      @mark feeer21 This is a great idea.

    • PatalJunior
      PatalJunior 7 months ago

      @Ziero He could have made a score based on the time it took to complete checkpoints, and lowered the score for hitting walls, so that the AI tries not to hit walls, instead of just trying to stay in the middle.

    • Peter Pike
      Peter Pike 7 months ago +2

      @Lasse R. -- "To anyone arguing against this.. look at the 'pro' player.. how many walls did he hit? 0." If the pro is the standard, then just have the AI replicate the pro. No need to teach anything. On the other hand, if you want to use random variation to come up with the BEST solution overall, you have to let the AI decide things that appear to us to be "stupid." How many optimal strategies are missed because people never even consider that they would be viable? You need someone who thinks "outside of the box" to discover the alternate path. That's why the only thing that should matter is the goal, and you let the process try as many (and in an ideal universe, every) possible paths it can until you find the genuine optimal one.

      Also, since it's a video game, it's possible that the optimal path will be a glitch. Maybe hitting the wall, then pressing both accelerate and brake at the same time and releasing both buttons repeatedly for 1/2 second causes a buffer overflow and gives you extra acceleration. It's not likely, but you're not going to find bugs like that using normal gameplay since normal gameplay is what they test when they look for bugs.

    • Taylor
      Taylor 7 months ago

      Wallhits can and are used in trackmania speedruns for optimal times. A way to rapidly change direction at the cost of some speed. Not bad if you're, say, just needing to reangle yourself for a chicane with a turn after you'd be going too fast for anyway. Much like this map.

  • Careless' Coaching
    Careless' Coaching 7 months ago +14

    adding another fitness factor for having travelled the least amount of distance while still finishing the race would get MUCH closer to the pro driver.

    • Careless' Coaching
      Careless' Coaching 5 months ago +1

      @Timothee Lfbv I would argue complexity is a requirement to reach a perfect line, as there are a great number of factors in the concept.

    • Timothee Lfbv
      Timothee Lfbv 5 months ago

      @Careless' Coaching Sure, I agree! It will be faster indeed, and it could be even faster with other information, but it may be too complex to give a car the perfect racing line!

    • Careless' Coaching
      Careless' Coaching 5 months ago +1

      @Timothee Lfbv Totally aware as they take the line that offers the least distance weighed against the least loss in speed when rounding turns, but this addition of "complete the map with the least distance" is only ONE fitness metric that would be balanced against the others by the algorithm.

      Shortest time, least distance, highest top speed etc. Just because one isn't the only factor doesn't mean it's not relevant.

    • Timothee Lfbv
      Timothee Lfbv 5 months ago

      Pro drivers don't use the racing line with the least distance

  • OldSlowGamer
    OldSlowGamer 7 months ago

    This AI is missing some important algorithms for learning the fastest way around the track. Those would be algorithms for finding the shortest line around it, and the concept of the apex of the turn that goes along with it.

    Next level additions: G meter, tire adhesion parameters, braking function. The ultimate line is the shortest, the ultimate braking, acceleration, and cornering profile keeps the tires at their adhesion limits but never exceeds them.

  • Nicolay Hoven
    Nicolay Hoven 7 months ago +1

    I feel like genetic algorithm is only limited by what you let it see.. if you allowed it to look ahead, like humans can in the game, it would likely do significantly better.

  • Alex Worden
    Alex Worden Month ago +3

    This is awesome! Thank you for putting this together. As a human, your attention is highly focused on the track ahead in order to take the curves correctly. I don't think you're giving your AI enough input to work with. I wonder if you'd have more success if you gave it more detailed "distances to the track edge" for the center of the field of view.

  • luvinthe jazz
    luvinthe jazz 2 months ago

    The final runs remind me of water flowing down a stream bed. It shows why a river has curves. The particles collide with and erode the outside of the curves, and leave deposits on the inside of the curves.

  • Wilco
    Wilco Month ago

    I think it might help if more detailed input information is given to the AIs. With only the distance in certain directions, you can't really see or anticipate what is up ahead, which in my opinion makes a big difference!

  • Knight Before Dawn
    Knight Before Dawn 8 months ago

    Just a thought for more efficient learning using the genetic algorithm. Have each checkpoint be worth (10,000+(cp*10,000) - time passed in milliseconds) fitness. This way faster paths have higher fitness. And have the finish be similar but more total points and more value loss due to time.
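
    A minimal sketch of the checkpoint scoring above, assuming `cp` is a zero-based checkpoint index and that each time is the race time in milliseconds at which that checkpoint was crossed. The finish term (a 100,000 bonus with time counted double) uses hypothetical values, since the comment only says the finish should be worth more and lose more value over time.

```python
from typing import Optional, Sequence

def checkpoint_fitness(checkpoint_times_ms: Sequence[int],
                       finish_time_ms: Optional[int] = None) -> int:
    """Score a run from the race time at which each checkpoint was crossed."""
    fitness = 0
    for cp, time_ms in enumerate(checkpoint_times_ms):
        # Formula from the comment: 10,000 + cp * 10,000 - elapsed ms.
        fitness += 10_000 + cp * 10_000 - time_ms
    if finish_time_ms is not None:
        # Hypothetical finish term: bigger bonus, time counted double.
        fitness += 100_000 - 2 * finish_time_ms
    return fitness

slow = checkpoint_fitness([4_000, 9_000])
fast = checkpoint_fitness([3_500, 8_000])
print(slow, fast)
```

    The faster run scores higher at every checkpoint, and a car that reaches more checkpoints collects more terms, so progress and speed are both rewarded.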

  • Trevor Croteau
    Trevor Croteau 2 months ago

    The final compilation made me think of how fluidly dynamic the machine learning attempts look. Very weird.

  • nak attak
    nak attak Year ago +608

    This just goes to show that even in a world where geniuses are all around you, some idiots decide to bash their heads on the wall instead

    • QuantumSoul
      QuantumSoul 9 months ago

      It's more people born without legs

    • : B
      : B Year ago +3

      @sulosky thats deep

    • TheRedbikemaster™
      TheRedbikemaster™ Year ago +18

      @sulosky brute force method

    • sulosky
      sulosky Year ago +45

      Enough idiots banging their heads will finally get through.

  • William Knowles-Kellett

    It's interesting to me that they take wide turns. Because their inputs come from raycasting, they can't see around the turn like Trabadia. By taking the turns wide, they can see better.

  • Nemesis666first
    Nemesis666first 7 months ago

    Very impressive, in particular because you were able to do it in a common game, and not in a Python game.
    I tried to learn a bit about machine learning, but it's always about a game that you have to create yourself (even if it's basic, this is the part I don't like).
    I would like to be able to make it work with regular games, to see how AI is able to perform, exactly like you did on Trackmania.

    I will follow the link ;) .

    Thx for the video and the work :D !

  • Marty´s Coding Palace
    Marty´s Coding Palace 3 months ago

    Amazing video! How did you manage to extract the information for your Neural net´s inputs (Distance to the walls,speed,...)??

  • Do Your Own Research

    I haven't watched the whole thing, but my initial question is whether they are slowly learning this track, or whether it can also be generalised to be good at new tracks quite quickly too.

  • Siklone Gaming
    Siklone Gaming Year ago +473

    20 generations in and I'm still the car hitting the wall at the start line.

    • Akimo
      Akimo Year ago +3

      @DFDempire Jokes. You know what jokes are, do you?

    • 707 Beats
      707 Beats Year ago

      @Siklone Gaming lol

    • Siklone Gaming
      Siklone Gaming Year ago +13

      @DFDempire what? I'm aware of how this is structured. I was making a joke.

    • DFDempire
      DFDempire Year ago +13

      Genetic mutations.
      As we evolve, recessive traits can sprout, causing the mutation.

      It's based off survival of the fittest but has bad traits. Over time there will be fewer and fewer until there are no more.

      It's literally evolving and changing its (DNA)

  • Suckynewb
    Suckynewb 8 months ago +7

    Yea the biggest flaw is that it's not really learning "How to drive", it's just learning to navigate this track through trial and error.
    If it was able to "see" walls and knew walls=bad, it would be able to learn new tracks and possibly complete them on the first run. Then throw in some scaling rewards for reaching checkpoints in a certain amount of time and you've got a racer.

    • The [Redacted] Failure
      The [Redacted] Failure 8 months ago

      The biggest issue with that is that in many cases in high-level Trackmania gameplay, out-of-bounds bugs and bug abuse give advantages, and something called wallbanging can be very useful on some maps. So what I'm saying is that, no matter what, the AI cannot reach human levels unless it makes radical and ludicrous advancements or has an unrealistic amount of time, as our computing power is nowhere near that level yet

  • Matthew
    Matthew 6 days ago

    I've never played trackmania but I assume there's speed loss from bumping the edge, right? If so, creating a penalty for touching the wall but still favoring time overall could have been another parameter to increase efficiency.

  • Ben Fluke
    Ben Fluke 7 months ago

    I feel like the main reason you had that giant misstep after you added more of the map is precisely that they went without access to that information for so long that the artifacts of bad decision making would have, as you said, taken too long to 'breed out'

    I would be greatly interested in another clean run from gen 0-100 with them having full access, with perhaps a few more checkpoints to give useful data in the start when they will need to strike the checkpoint almost by chance the first time.

    just a theory.

  • DrZaius3141
    DrZaius3141 6 months ago

    I feel that the AI has problems with generalization because at the beginning of the track it learned to always go full throttle, whereas in the later stages it has to slow down a bit - which it never learned.

  • jnmaietta
    jnmaietta Year ago +27

    One thought, could the AI be slamming into walls because they don't have brakes built in as a response? From what you mentioned, they can only turn and accelerate. Would explain the lack of learning and inability to approach human times past that certain point.

    Anyways, the whole video is awesome and I wanted to give you specific props on the editing and overlays on this. Really help visualize concepts. Idk how you select which runs to run together in clips, or how you got that shot sitting in the middle of the track with cars going by, but they were great visualizers.

    • Yosh
      Yosh  Year ago +10

      Thanks !!
      You don't need brake on this specific map, Trabadia didn't use brake in his run for example. And the AI is still able to stop accelerating. But brake would be useful in more complex maps.
      It's easy to sort replays in folders, and to select and edit specific replays ingame. And there are tools to edit camera shots ingame.

  • ruukaoz
    ruukaoz 3 months ago

    It would be interesting to see how the A.I. would tackle the challenge if the map mutated in slight ways, and how that would affect the learning.

  • Trashpanda
    Trashpanda 3 months ago

    Can't you do supervised learning at first and then swap out the value function and continue with a genetic function? I would also add penalties for things like bumping into the sides.

  • Francois Kirsten
    Francois Kirsten 5 months ago

    I know it's 8 months after, but couldn't you maybe also use the time between checkpoints instead of time from the start? Then maybe the AI can use the data from the best 'between checkpoints' to put together the best possible time
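
    A minimal sketch of the "best between checkpoints" idea above, assuming each run is recorded as a list of cumulative checkpoint times in seconds and that all runs pass the same checkpoints. The function names and the sample times are made up for illustration.

```python
from typing import List, Sequence

def splits(cumulative: Sequence[float]) -> List[float]:
    """Convert cumulative checkpoint times into per-sector times."""
    previous = 0.0
    result = []
    for t in cumulative:
        result.append(t - previous)
        previous = t
    return result

def theoretical_best(runs: Sequence[Sequence[float]]) -> float:
    """Sum the fastest time seen in each sector across all runs."""
    per_run = [splits(run) for run in runs]
    return sum(min(sector) for sector in zip(*per_run))

runs = [[10.0, 25.0, 40.0], [12.0, 24.0, 41.0]]
print(theoretical_best(runs))
```

    The stitched time is only a lower bound to aim for: a fast sector may depend on the speed and position the car had when entering it, so the sectors cannot always be combined in a single run.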

  • jason vonhaartman
    jason vonhaartman 3 months ago

    Could you do a reward system, where if you go fast you get more score than going slow, less for turning, and a big no for hitting the wall? Oh, and a bit for the distance moved down the track.

  • diariaking247
    diariaking247 Year ago +187

    The AI could probably do better if there are more “sight” lines clustered toward the front allowing it to make more precise movements.

    • French geek
      French geek Year ago

      thexvid.com/video/yZFY5ZJtgyM/video.html

    • Tesla Model 3
      Tesla Model 3 Year ago +1

      I wonder if it remembers the next turns. If not, it will never improve as much. It just learns to drive a track as if it is the first time it drives it. I think Trabadia drove the map a few times before setting his best time. Is that true? How many times did he try?

    • johan wendell - original music
      johan wendell - original music Year ago +1

      @camilo hurtado acero That's because the AI was trained to just finish the track, not get the fastest time.

    • camilo hurtado acero
      camilo hurtado acero Year ago +10

      Yes, the AI's data resolution is too weak. But at 11:45 of the video, the AI makes a corrective movement to the left, because it wants to be in the center of the track, not on the fast lane. You could prioritize the distance ahead so the AI doesn't make too much of an anticipatory move.

    • Jouni Ranta-Puska
      Jouni Ranta-Puska Year ago +21

      I was about to say this as well. As it is now, I feel like it's handicapped compared to a human.

  • Anton Nym
    Anton Nym Month ago

    Super-interesting! I found you by accident and subscribed immediately. Thank you! Very smart stuff. I love it.

  • Josh Ellis
    Josh Ellis 7 months ago

    I feel like quantum computing along with machine learning will be the key to unlocking true AI

  • Hunters Mark
    Hunters Mark 4 months ago

    I'm sure every single one of those AI received a trophy for participating

  • TheBlankFaceGamer
    TheBlankFaceGamer 6 months ago

    I think the biggest issue with the A.I. is that it's learned that the best way to get the fastest time is to not turn into walls, whereas the fastest way to make a turn is to stick as close to the inner wall as possible without touching it.

  • EvilTaco
    EvilTaco Year ago +68

    I feel like the reason they're not performing as well is that they're very limited in what they can see. They can only see the walls right in front of them, so they can't think ahead for the next curve and account for it, which is why they always run into that one wall in the curve

    • EvilTaco
      EvilTaco Year ago

      @Trygeve bruh how do you keep finding me everywhere

    • Trygeve
      Trygeve Year ago +1

      hi taco.

    • Dragoonsoul7878
      Dragoonsoul7878 Year ago +3

      @AE Templates Well, actually it is: we can see over the walls. Even without knowing the turns ahead of time, you can see the track ahead of you. If you want the test to be fair... you'd need to recreate the track with walls a human couldn't see over... in which case you'd see humans advance similarly to the AI. Slamming into the walls or going too slow... while we'd learn "faster", we'd basically learn the gap between generations.

      In other words, comparing AI to humans here is completely unfair as we have two different tracks.

    • Amicaze 95
      Amicaze 95 Year ago +2

      @AE Templates Rather than making it map specific, a better solution would be to give lines of sight that correspond to different track pieces. For instance, if you use circles and curves as LoS, the AI will be able to see past the bends into the turns. The good thing is that with this kind of learning algorithm, you shouldn't need to do complicated stuff; the AI should figure it out on its own.

    • dubtor
      dubtor Year ago +6

      Seems like the measured input parameters (wall dists and speed) are reaching their limit regardless of the number of future iterations. Certain curves or curve combinations look the "same" to the AI, whilst in fact the AI should understand ahead of time that they are not the same (by measuring other/additional parameters). Because there is a limited number of curve types, and thus combinations of them, in TM, one could try to make the AI "see" which ones it has at hand and learn accordingly. Imo this way the record of the reference driver may be broken. Also, does the AI steer with inputs between 0-100% or always 100%? Maybe this adds extra friction and therefore slowdown?

  • Daims
    Daims 7 months ago

    It may also be an issue that the AI doesn't know what will happen at the next turn, as a human would. That makes some turns impossible, like the last one, where the AI keeps its speed too high, since that is how it takes such turns, but doesn't know the next one requires less speed to be made without hitting a wall.

    Would it be possible for the AI to see the race from above with freecam, or is that too difficult to do?

  • knash97
    knash97 Month ago

    Could you take your fully trained supervised learning network, and use that as the starting point for the genetic algorithm instead of just a random start?

  • Tinker Bear
    Tinker Bear 7 months ago

    Nicely done! Fine tuning performance is a question of very fine nuances, you should give it more data if you want it to perform better; if you give them data every like 10 deg rather than only 0, +/-15, +/-30, +/- 60 deg...
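
    A minimal sketch of the denser ray layout suggested above, taking the angles listed in the comment as the current setup. The ±60° limit and the helper name `ray_angles` are assumptions for illustration.

```python
def ray_angles(max_angle_deg: int = 60, step_deg: int = 10) -> list:
    """Evenly spaced ray angles in degrees, symmetric about straight ahead."""
    return list(range(-max_angle_deg, max_angle_deg + 1, step_deg))

current = [-60, -30, -15, 0, 15, 30, 60]  # angles listed in the comment
denser = ray_angles()
print(len(current), len(denser))
```

    Going from 7 to 13 rays also means the network needs more input nodes, so the search space grows along with the detail.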

  • mistero king
    mistero king 7 months ago

    What about using both? First, use supervised learning to make it more efficient, then give that data to the genetic algorithm, so the genetic algorithm can try to evolve from the knowledge it has already been given. I think this way will save a lot of time because it will start with enough information about the game and its environment. It may also improve development and the conclusion within a hundred generations. These are my thoughts and I don't really know much about machine learning.

  • Benjikrafter
    Benjikrafter Year ago +186

    There’s a bunch of ways to “outsmart” genetic algorithms by teaching certain skills you previously knew were important before teaching the primary goal. For example, teaching not to hit walls or turning as little as possible. Using this you’ll have a more refined base AI to learn the track. I’d love to see you try something like this again using this method!

    • Peter Pike
      Peter Pike 7 months ago

      @Elzig Von Schnitzel -- "Rubbing is racing." :-) But in reality, the only reason you don't hit walls and try not to hit other cars in real life is because the damage that you get will make you go slower. If it made you faster, it would be the strategy. And you don't know in a video game what effect it will have until you try it.

    • Elzig Von Schnitzel
      Elzig Von Schnitzel 7 months ago

      @BillyViBritannia although I very much agree with what you said... I love racing in general, and you don't race by hitting walls or other cars.

    • Psych Engel
      Psych Engel 8 months ago +4

      One major flaw in his AI was the fact that they didn't know how to slow down or use brakes; that's why they bump into the walls and drive like a drunken guy...

    • Benjikrafter
      Benjikrafter 9 months ago +3

      @BillyViBritannia I actually really like this response. I completely neglected the possibilities an AI adds for the sake of speeding up the process.
      I still think a starting point is very useful in many cases such as I explained, but that could very easily not give the best outcome, but rather a decent outcome sooner.
      I have to concede that my approach aims for a good AI fast, rather than a ‘perfect’ AI that maximizes the task.

    • BillyViBritannia
      BillyViBritannia 9 months ago +9

      I disagree. The power of AI for me comes from the ability to find solutions to problems that humans can't find. Part of this could include a physics glitch where the AI rams into a wall on purpose just to be launched through the checkpoint faster.
      We will never reach those possibly desirable solutions by limiting the AI to our imagination.

      Now if you just want a boring AI to automate things and don't care about the best possible solution then it's fine I guess.

  • Cyrille Tessot
    Cyrille Tessot 3 months ago

    Is there a way to combine genetic and supervised learning? Maybe you start with some data collected but enable some random neural networks and collect data between each run to improve the supervised learning

  • Andy Mills
    Andy Mills 7 months ago

    I wonder how it would work if you used other benchmarks, like preferring cars that didn't hit the wall (circle strategy!). I'd actually like to see how that would work a lot; make the walls lava, once you get a fair number making it through. Any wall hitters die off. Maybe if you added an element where there's "male" and "female" drivers and the pool of drivers is restored by a random pairing of the male and female finishers.

  • Aulne
    Aulne 7 months ago

    The reinforcement learning approach is quite different from that of genetic algorithms; you are wrongly conflating the two.
    Your approach does indeed use a genetic algorithm. In reinforcement learning you have to define, for example, a reward function, and the objective is to maximize the sum of rewards over an entire race. There is no concept of gene mutation or heredity; the neural network is simply optimized to maximize the rewards.
    Excellent video otherwise, I salute the work

  • Marcos Martínez García

    There are some cars that are faster until they reach the corner, where others overtake the first ones by turning better. Is it possible to combine the best of them?