AI Rebels

RemyxAI pt. 2: Agent Boogaloo ft. Terry and Salma

Jacob and Spencer Season 2 Episode 22

In this exciting return visit, Terry and Salma from RemyxAI join the AI Rebels Podcast to share the latest on how their platform helps developers navigate the ever-growing world of LLMs (Large Language Models). They discuss the evolution of AI research and why choosing the right model—from language-based to multimodal—is so critical (and can save developers enormous time and resources).

Hear how RemyxAI’s new tools streamline everything from data generation and synthetic data creation to advanced benchmarking, model fine-tuning, and even building entire data pipelines. Terry and Salma offer an inside look at their “mix board” concept, which compares candidate models and automates big chunks of the evaluation process. Plus, they dive into the future of AI development—covering vision-language models, robotics possibilities, and how a single developer can use these powerful toolkits to compete with industry heavyweights. If you want a sneak peek into cutting-edge AI workflows and the thrilling open-source landscape, don’t miss this episode!

Hello everyone, and welcome to another episode of the AI Rebels podcast. Today we have some guests back from our first season: Terry and Salma from Remyx AI. They're here to talk about what they've been up to. Terry and Salma, first off, thank you both so much for coming back on, and thank you for taking time out of your day.

Likewise, thank you, Jacob and Spencer, for hosting us. Last time we had an awesome conversation, and it's a pleasure to hop back on and continue it.

Awesome. We are very excited. This is really what Spencer and I envisioned when we first talked about this podcast: let's build this network, this family we call the AI Rebels family, and see where it goes, because people like you are the ones building the cutting-edge AI tools and getting integrated more and more into the AI space. So we're really excited to see what's happened in the eight or nine months since we last talked to you.

Yeah, it's been a minute, and a ton of new things have been popping up. The space is maturing over time. There are a lot of common problems that people are starting to converge on, and patterns about what kinds of applications work. We've been learning alongside everyone else, figuring out the new challenges that are emerging because of all the changes and new releases, and we've been using that to guide where we take Remyx: what are the best kinds of problems where we could help developers, and what tools will help solve them, whatever they're building applications for. How are they helping the business? How are they lowering costs, serving more customers, or reaching new heights? Because we have this huge unlock. It's been super exciting to learn alongside everyone and see where this is going.

So why don't we just jump right in? I'd be really interested in hearing what you've been up to since we last talked: any changes, any pivots.

When we chatted last time, we were starting to think more about the problems people were facing with LLMs. The cost of training models has gone down, and more people are finding smaller models perform better out of the box, but there are still things they want to do to get the best performance for their application. Since we last chatted, we've been doing more with synthetic data and more with LLM fine-tuning. Multimodal models have been pretty big this year, and we've been experimenting with use cases for specializing those models, with some open-source projects around that. This year we've really treated the studio, that application we were showing before, as a test ground for which things were interesting and useful for developers. Now we're trying to double down on the most useful workflows and build up the CLI and the APIs for doing them. So we took what people were experimenting with through our app, and we're building out the code, infrastructure as code, for people to build on.
That's awesome.

We basically stuck true to the idea of a platform, a workbench, the studio Terry described: a place where you have a collection of tools. It's leaning more and more into foundation models, this new variant of models that may be a little different from classical deep learning architectures or classical machine learning. In this new paradigm, the models have essentially distilled the internet down into their weights, and out of the box they can do a ton of things really well; they're very impressive on their own. What we've been finding is that folks can take these models and create a quick POC to validate that a particular application is possible. Where people are now struggling is getting that last 10 to 20 percent improvement in performance, turning that MVP into something that can be put into production and then improved upon over time. We've been exploring deeper into that gap: how do we help folks figure out what tools, mechanisms, techniques, and methods, all the stuff that's been released in open source, will let them inch from their initial MVP to a point where they have many possible paths for improving that application, or even making new applications with those ideas?

Something that's really resonated with people is the burden of choosing a model. Every week there are new models coming out, and some of them have challenges that make them harder to work with, so people are constantly asking: is this model worth it? There are so many to choose from. Are they well documented? Some models have problems like overfitting benchmarks. What tools or utilities are out there to recommend the best model? And shouldn't that be based on the context of your application? There are practical constraints everybody faces: is this model small enough to be fast, is it big enough to be good, has it been trained on relevant data, is it better at math? Trying to help people match models to their use case has been one of the big focuses.

How do you do that? Right now I'm envisioning a dating app. How do you match people? Do you have a questionnaire? What does that look like?

Love it. The idea is that we want to find the best match given the context of your application. One of the simplest pieces of context you can give is a sample of your data, if you're going to fine-tune on something like that, or even just an example of what kind of question or prompt someone will feed into your application. We use different probing strategies to see how the models you want to compare, as a group, respond to that initial sample, and then we use LLM-as-judge to rank them, say from 1 to 5 or with some other ranking mechanism, to tell which model in the list gives the best-fitting response.
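[Editor's note: a minimal sketch of the LLM-as-judge matching idea described here, assuming an OpenAI-compatible endpoint. The candidate model names, the judge model, and the scoring prompt are placeholders, not Remyx's actual implementation.]

```python
from openai import OpenAI

client = OpenAI()
CANDIDATES = ["model-a", "model-b", "model-c"]  # hypothetical candidate models
SAMPLE_PROMPT = "Summarize this support ticket in one sentence: ..."

def ask(model: str, prompt: str) -> str:
    # One chat completion against an OpenAI-compatible API.
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def judge(prompt: str, response: str) -> int:
    # Ask a judge model for a 1-5 fit score, mirroring the ranking described.
    rubric = (
        "Rate 1-5 how well this response fits the request.\n"
        f"Request: {prompt}\nResponse: {response}\nReply with only the number."
    )
    return int(ask("judge-model", rubric).strip())

# Probe each candidate with the sample prompt and rank by judge score.
scores = {m: judge(SAMPLE_PROMPT, ask(m, SAMPLE_PROMPT)) for m in CANDIDATES}
for model, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(model, score)
```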
The model whose data priors sit closest to what you'll actually see in your application is probably the best one to move forward with: you'll spend less time and energy closing the gap between what the model was previously trained on and what you need it to do next. The idea is that from that little piece of context, we can tell which model has the best data priors, meaning it will probably be the easiest one to fit into whatever application you're trying to develop.

With that process, in the end, is it a user-driven choice? You feed in the information, the data, and then the user essentially chooses: hey, I like this answer better than that one?

It's being judged by an LLM, so it's more automated, but we're not fixed to one specific way of evaluating the models either. A lot of people have tried working with benchmarks, maybe more fine-grained ones, and people are using synthetic data and judges like what we're describing with this matching feature. But there's probably room for more custom judges, ones where the user drives more of the evaluation criteria. Eventually you could design a rubric that you feed into the LLM judge, and it would use your rubric: I like responses with this quality, so rank all the models on it and automatically return the ones best fit for that context. There's a variety of methods to evaluate, and some are better suited to certain stages. This is a good way to quickly narrow your search from dozens of models down to maybe half a dozen. Then, when you're comparing or optimizing models, maybe through fine-tuning or some other application-specific process, you can use other evaluation methods there too. The emphasis is on automation and quickly narrowing the scope of candidate models, and then we want a really broad suite of benchmark evaluations as well, because those have strengths in other parts of the workflow, like making sure you didn't lose something after training a model.

Something we keep concluding over time is that evaluation probably needs to occur at many steps. The matching we're describing helps with model selection when starting a new project, or when revisiting how to update an existing one. As you move through the process, you might use benchmarks post-training to check whether the skills that came out of the box in the base model are still present, or whether the fine-tuning altered them. Later, when the model is deployed and being used in your application, you probably want to understand how users are responding to its output. That's a different kind of evaluation again, in context with all the other measurements you took along the way, so that at the end of the entire process you have a better idea of the directions to explore to improve the model in X, Y, or Z way.
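[Editor's note: a sketch of the user-authored rubric idea mentioned above, assuming the same OpenAI-compatible setup as the previous snippet. The criteria and the averaging scheme are hypothetical.]

```python
from openai import OpenAI

client = OpenAI()

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

# A user-defined rubric: each criterion is scored separately by the judge.
RUBRIC = {
    "conciseness": "Is the response free of filler and repetition?",
    "groundedness": "Does the response stick to facts given in the prompt?",
    "tone": "Is the response polite and professional?",
}

def rubric_score(prompt: str, response: str) -> float:
    # Score each criterion 1-5 with a judge model, then average.
    scores = []
    for criterion, question in RUBRIC.items():
        q = (
            f"Criterion: {criterion}. {question}\n"
            f"Prompt: {prompt}\nResponse: {response}\n"
            "Rate 1-5. Reply with only the number."
        )
        scores.append(int(ask("judge-model", q).strip()))
    return sum(scores) / len(scores)
```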
That's awesome. A question on the initial evaluation and choice: does the user decide up front, "I want to look at language models, here's my task"? Say a customer comes in thinking they need a language model, and from their input it turns out they just need a really simple classifier.

Right now the evaluation framework is one piece, and the agent is the layer we want to add on top of that studio, the layer that makes recommendations out of all those little data points. From all the measurements you took and the context you gave us about what you're building, the agent will learn alongside you over time: okay, maybe we actually want to approach the problem this way, or we could design the application that way, or with the current design of the model we might want to explore adding this data, because users responded with prompts we don't actually have in the data distribution, so let's kick off data synthesis or data curation jobs. Right now we've been building up tools to unblock developers immediately, and then we'll use that agent to learn on top of them, and alongside the developers, to provide the kind of opinionated automation you describe. That exists in a limited way today when you interact with the agent. More of the tools we've made recently focus on LLM fine-tuning, since these are larger foundation models, but what you're describing, recommending somebody try BERT for a problem instead of a Llama model, is just a tool away. It's a matter of instantiating the agent with those opinionated instructions. We think LLMs with tools are a good way to do that, and at the same time they keep things easier than working with a giant config or something like that, making it easier for people to quickly get something done.

We also think the agent could help organizations at large, and individual developers, work with an AI strategy that grows with them over time. If you look at the last ten years, the most successful companies that have put machine learning into production, think of Netflix recommending movies that keep you hooked, have over the years built out specialized workbenches and designed the metrics that help them understand how to better recommend new series and movies to keep you watching. That takes a lot of effort. We're hoping that through something like an agent, that LLMs-with-tools idea, we can help people build up that intelligence, so you more quickly find which metrics actually matter for improving your application. If you're building a chatbot, is it responding in a way users engage with positively, or is it improving some KPI you've designed, without you having to spend ten years building all of that from scratch and maintaining it from that point onward?
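[Editor's note: a toy illustration of the "LLMs with tools" pattern described here, where opinionated recommendations live in tools the agent can call. The tool names and behaviors are hypothetical stand-ins, not Remyx's actual API.]

```python
def recommend_model(task: str) -> str:
    # Opinionated prior: simple classification may not need an LLM at all.
    if "classify" in task.lower():
        return "Try a small classifier (e.g., a BERT fine-tune) before a Llama."
    return "Start with a small instruct-tuned LLM and evaluate from there."

def kick_off_data_synthesis(prompt_gap: str) -> str:
    # Stand-in for queueing a synthesis job to cover missing prompt types.
    return f"Queued synthesis job to cover prompts like: {prompt_gap!r}"

TOOLS = {
    "recommend_model": recommend_model,
    "kick_off_data_synthesis": kick_off_data_synthesis,
}

def run_tool(name: str, arg: str) -> str:
    # In a real agent loop, the LLM emits the (name, arg) pair; here we
    # dispatch directly to show the shape of the pattern.
    return TOOLS[name](arg)

print(run_tool("recommend_model", "classify support tickets by urgency"))
print(run_tool("kick_off_data_synthesis", "multi-turn refund questions"))
```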
Can we help people have that intelligence humming in the background before they even get to the meta-problem? And that's another point I was just thinking about: code generation is becoming a lot more interesting as a space, and with it being commoditized, we think the primary function of an ML engineer, AI engineer, or data scientist building applications won't necessarily be the code generation itself. There will be tools to handle integrations and create a quick template for how you train this thing. The idea is that we'll rely more and more on the engineer's expertise, experience, context, and creativity in how they want to solve problems, and the system will learn alongside them to improve on that over time.

What we've been thinking more about is the experimentation workflow: going from the baseline, the V0 MVP of a project, to consistently making improvements with the changes you're experimenting with. What are the dimensions people really like to experiment with? Often the easiest experiment to carry out is swapping the model, because making models is expensive. So how can you help people through that problem when they're investing in a V2, a V3, and so forth?

What I pulled up real quick is our docs: a collection of workflows, and eventually the studio or workbench with that intelligence baked in, to help folks design and then improve their applications. Right now we've been super focused on making all of that easy to access and easy to work with alongside the other tools you might regularly use for data wrangling or model training. We want tools that work in the terminal or with infrastructure as code, to help you move that process along. So if y'all want to check this out, we have some new docs and also tutorials on how to use the evaluation frameworks specifically.

This is the idea of what we're building towards: abstractions like what we call a mix board. You have many models that you want to look at from different facets, with different kinds of evaluations: the model recommendation based off a sample prompt, the evaluation on common benchmarks you see in the leaderboards, or eventually your own custom rubric, where you evaluate how models respond to specific prompts or questions. We've been building abstractions to help people compare a bunch of candidates. Through either the APIs or the front end, you can see how these models rank. For example, with the matching feature we returned a ranked list for a specific utterance or context: the smaller models were actually doing really well, and the instruct variants seemed to be performing relatively well, so if you wanted to explore some models, you might look at the ones at the top.
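[Editor's note: a rough sketch of the "mix board" abstraction as described: one comparison surface collecting several evaluation facets per candidate model. The class and field names are illustrative guesses, not Remyx's schema.]

```python
from dataclasses import dataclass, field

@dataclass
class MixBoard:
    # One experiment's worth of candidate models and evaluation facets.
    candidates: list[str]
    facets: dict[str, dict[str, float]] = field(default_factory=dict)

    def add_facet(self, name: str, scores: dict[str, float]) -> None:
        self.facets[name] = scores

    def ranking(self, facet: str) -> list[tuple[str, float]]:
        # Best-scoring candidate first for the chosen facet.
        return sorted(self.facets[facet].items(), key=lambda kv: -kv[1])

board = MixBoard(candidates=["small-instruct", "large-base"])
board.add_facet("prompt_match", {"small-instruct": 4.5, "large-base": 3.8})
board.add_facet("benchmark", {"small-instruct": 0.61, "large-base": 0.67})
print(board.ranking("prompt_match"))
```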
And over time we're exposing more ideas for how you might look at models in different lights.

That's super cool. So if I understand it, right now there's model selection: someone can come in, use the agent to create an AI app, and use model selection to determine the best LLM for their app. Is that right?

Right. And when we help somebody take a list of maybe a dozen or more models down to two or three, it's much easier for them to keep the cost and time of experimenting down and make it more iterative, and then maybe think about things like fine-tuning or other ways to optimize the model further.

Your visual representation is slick, by the way.

In this view, with the utterance, or with the other ways we want to evaluate, we present the rank or score of the different models you wanted to compare; in this case we only compared two. The idea is: all right, we found you a signal, here's a recommendation, why don't we go ahead and train this model? We'd help you piece together all the other steps. Maybe we help you figure out how to design your dataset for this particular prompt or utterance. Say you need to expand it: maybe you find other real samples in open source, or you expand it synthetically, and we could help with that. Then we recommend ways to train and eventually serve that model, so you can test with a human: if someone asks something like this, does the model perform the way you expected? We capture that information and feed it back into the system over time.

I'd be curious to hear how you evaluate the rankings themselves.

We've experimented with some of the techniques for doing this, and it's part of the problem we want to own and close the gaps on. You might do it empirically, but that can cost a lot more, and this application we're building gives us the visibility to really own it. It's an early service, so I won't claim we've solved it. We're advocating that people use a variety of unrelated techniques. I've seen a bunch of different advice on evaluation, but generally speaking, you're looking for a general trend: is one model dominating across a few different measures? We're trying to build the framework that lets us track that information and really close in on that prediction problem.
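[Editor's note: one simple way to look for the "general trend" mentioned here is to average each model's rank across several unrelated evaluation methods, a Borda-style aggregate. The numbers and method names below are purely illustrative.]

```python
# Each evaluation method produces its own ranking, best model first.
rankings = {
    "llm_judge": ["model-a", "model-b", "model-c"],
    "benchmark": ["model-b", "model-a", "model-c"],
    "rubric":    ["model-a", "model-c", "model-b"],
}

def mean_rank(rankings: dict[str, list[str]]) -> dict[str, float]:
    # Average 1-indexed position across methods; lower is better.
    models = rankings[next(iter(rankings))]
    return {
        m: sum(r.index(m) + 1 for r in rankings.values()) / len(rankings)
        for m in models
    }

# model-a ranks best on average, suggesting it dominates across methods.
print(sorted(mean_rank(rankings).items(), key=lambda kv: kv[1]))
```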
That's fascinating. I feel like everyone is trying to figure out how to deal with this problem, and everyone has their own approach. Did you identify it just by observing the industry, or from actual users of the agent?

Well, whenever we read a paper, for example, the authors always scope down the world of candidate models, maybe to the ones on certain benchmarks, or the eight models that have been trained on domain-specific data. There are certain models they care about and certain evaluation metrics they care about; most models aren't trying to be good at the most general, AGI-style benchmarks. Once you get into an application, that scope definitely narrows down to what you need that model to do, and you might even end up needing a custom evaluation, because a benchmark by itself won't be enough. That's where we started to think about how the leaderboard could use some kind of experimental analogue: today, or this week, I have an experiment in my queue, and it's to compare my existing model with the next version, or some fine-tune, or a new model family that came out that could be better. And you bring in your opinionated priors: I don't care about models in this size range, or these models are too small for my use case. Bringing all of that in and scoping it down, that's the analogue to the leaderboard, what we're calling the mix board. It's for your experiment, for your comparison, and we can focus that comparison and help you see the list of candidates through a number of different lenses, so you can be confident, because there's probably not just one evaluation method to rely on. And even then, you know it might change over time. You might find in production, when users are actually using your application, that what you thought was important was really something different that no one cared about. You feed that back in: okay, we need to adjust our priors and prioritize things we didn't even have on the radar. That's not something you'll find in a benchmark dataset that researchers are using; it would be hard to capture even as a custom evaluation.

It's like the famous story about Apple Computer: they designed the GUI, and the first thing a user did was drag the computer icon to the trash and destroy the whole thing. You might want to account for that.

So, maybe to help some of our less technical listeners: could you, in just a couple of sentences, describe the consequences of choosing the wrong LLM? What happens if that occurs?

Let's say you've picked the wrong LLM. Maybe early on you spent a few weeks, say a month, and you provisioned a lot of GPUs, which maybe has your employer really hurting right now, to try out some pocket of models that seemed okay. And you go through the entire process from there: continuing to add more data, training your model, deploying it, sustaining its life cycle. You might have spent months ideating and pouring resources into making this one model work. If you'd had a better sense of the needs of the application, you might have picked a model that was a little closer to what you were trying to do, or generally better at the kind of task you needed, and maybe you'd have spent a month instead of many months getting the thing to work. That's a lot of sentences, but maybe that's the idea.

No, that's good. And practically speaking, the switching costs are high. Once people figure out a way to do something, they're not going to change, so if you've picked something that's not that great early on,
it's probably not going to be the first thing you try to revise when you're chasing improvements later. Whereas if you'd started with a better base, maybe you don't go down those rabbit holes; it's just better from the beginning.

Maybe to summarize it a little tighter: you might have started with one design, and after all of that work and investment found that a completely different design and model architecture would have worked a lot better with fewer inputs. So you can save yourself a lot of time, resources, and provisioning.

That makes sense. So is there any plan to target non-developers with this, or is the plan always developers?

You know, when we first began, it seemed like the direction to go was toward less technical users, maybe project managers, who could prototype the app and then hand it off to other teams. But as we've worked through it, it does seem like the people who want to own these workflows are developers, people more interested in AI, a little more expert. It's hard to say this belongs in a novice's "I made my first AI app" workflow.

To put a point on that, it's about who's most interested and invested. It could probably get to a point where a total novice, someone with no context, could build an AI application. But in terms of getting started and digging deeper, the most invested person is likely a developer, someone with general context who doesn't have to be data-focused in particular. A project manager, someone motivated about the application, its design, and its interface, could also have a hand in building it. But it's probably the world of people who are building or thinking about product where it makes the most sense.

Got it. That makes sense, because if you think about the general end user, customers, non-technical users, I don't know if a lot of them would even have a desire to build an AI app. There would be some, sure, who'd think it was cool.

Right, and especially right now in the adoption bell curve, I feel like we're still in early adoption, and the early adopters are, for the most part, more technical developers.

We see this tool as being most helpful for small teams that don't want all the overhead of building evaluation frameworks and want a fast, easy experience with these workflows. The idea is, if you're a crazy chemist trying to run an experiment, you spend your time on the experiment rather than all the other stuff it entails.

I see. It sort of reminds me of Heroku, but for AI. That's the vibe I'm getting. And for our non-technical listeners, Heroku is, or was, a web hosting platform with really extensive capabilities and hosted databases for developers. It took a lot of the overhead of managing an app's infrastructure off of your plate.

Yeah, totally. And for us, as we're finding, it's still very experimental; it still requires a lot of trial and error.
How can we make that process way easier, so you're not spending too much time on all the adjacent things? The Heroku equivalent of today, with AI, would probably have an agent helping you stand up some of that stuff: agents with tools. That's the direction we're taking.

So it's more of a helper to get you through the process.

Right. We imagine that as you work with this helper, it's gathering context about your application and deployment and your experimentation history, what's worked and what hasn't, so you have an assistant in running the next experiment.

So future improvements might be something to help with your ideation and experimentation to improve the app, maybe some agent-creation abilities for Remyx itself?

Definitely, long term. And we learn from that ourselves. We can totally see a future with prepackaged applications where you just feed in the context. We're very much into eating our own dog food: we're using it right now in the studio, using AI to help build the AI. You could even design other AI systems with it.

To the point you made earlier: once we converge around what a good pattern for an agent is, that LLM shim layer helping people work with the weights, maybe that layer doesn't change a lot. Maybe open-source frameworks and the basic ideas give you most of what you can get out of an agent. So we're banking on the exciting part of development still being about modifying the weights. After seeing all the patterns, we'll converge on the ones that work, like the agent frameworks; what changes over time are probably the weights, and that will still entail the experimentation process: do these weights improve things? Those workflows, we think, will get a lot more people over the hurdle of making good, consistent changes to their apps.

I love that. The last several episodes we've talked a lot about how AI is enabling an individual, one person or a small team, to create something that can have a massive impact. Things like this are what's essential to compete with the bigger companies. How can you compete with OpenAI if you don't have something like this? They're continually evaluating, continually adjusting weights, doing all these things, because they have the resources. This really is empowering to the individual, the frontline people.

We're hoping a data scientist will be unblocked: building up so many things, iterating and experimenting really fast, not bogged down with all the difficulties of setting up environments or learning new methods. We want to expedite people on the different techniques that are available. And just as you said, how does someone compete with OpenAI? We get glimpses into what they're doing to improve their models. For example, when you get two generations while you're chatting with ChatGPT, they're running an A/B test. That's a great way to get online metrics about how your application is performing, and a lot of organizations in the last ten years have relied on it heavily to improve things. But historically everybody else has been locked out, because it's so complicated and expensive. So we're at an interesting inflection point with the AI tools themselves, where maybe we can bake those patterns, that context, that experience in, to the point where it's more automatable and within reach for more folks, so we can all take advantage of patterns and ideas that were previously out of reach.
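[Editor's note: a bare-bones version of the two-generations A/B pattern mentioned above: serve responses from two model variants, log which one the user prefers, and track the win rate as an online metric. A stand-in sketch, not a production experimentation framework.]

```python
import random
from collections import Counter

wins = Counter()

def serve_ab(prompt: str, gen_a, gen_b) -> list[tuple[str, str]]:
    # Randomize presentation order so position bias doesn't skew results.
    pair = [("A", gen_a(prompt)), ("B", gen_b(prompt))]
    random.shuffle(pair)
    return pair  # show both responses; keep the variant labels server-side

def record_preference(variant: str) -> None:
    # Call with the label of the response the user picked.
    wins[variant] += 1

shown = serve_ab("Explain the refund policy.",
                 lambda p: "variant A's answer", lambda p: "variant B's answer")
record_preference(shown[0][0])  # pretend the user picked the first response

# After enough traffic, the online metric is just the win rate per variant.
record_preference("A"); record_preference("B")
total = sum(wins.values())
print({v: round(c / total, 2) for v, c in wins.items()})
```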
Definitely. Love that. Now a slight shift, I'm curious: with so many new players releasing these LLMs, has there been a specific model, a new release in the last several months, that you've been really excited about?

Yeah, actually. The Qwen vision-language models have been really good, and on top of that, the Allen Institute has released a fine-tune they call Molmo. I've been really excited about that one. It can point, and it can understand spatial relationships between objects in a scene much better than a lot of the open-source models before now.

We've been exploring a lot recently with multimodal models, because there have been more variants that are accessible to fine-tune. We've also been exploring, in open source, putting together data-composition pipelines to help build multimodal training sets, so you can introduce new tasks to some of these models. We've been experimenting with fine-tuning LLaVA, and hopefully when Molmo comes out with training code, that will be something to explore too. It's a super interesting area that we think will be a place to grow into, because eventually you won't just be dealing with text: you might want to inject image context or audio context or chat logs or something else to improve the task you're performing in your application. If we can get datasets generated, curated, and designed from the context you have, in a way that isn't painful, that will be super interesting for a lot of applications to come.

That's awesome. The world of open-source AI just keeps getting cooler every single day. And some of the open-source image models these days... I've been doing a lot of image-model work lately, and it's a little mind-blowing how far they've come. Sometimes it's a little concerning, but I'm excited.

Hopefully we can start to make them smaller and faster. That's the one thing I see holding back some really insane stuff with robotics. You can probably get these to run pretty fast on some of the NVIDIA embedded boards, and there are new pruning techniques coming out for some of these multimodal models, too, which are going to help reduce
the size significantly, and make even things like robotics move a little further ahead than where they've been stuck for the last few decades.

That's awesome.

In light of the multimodal models, something we've been excited by is an open-source implementation of a paper that describes how to take a set of images and transform or extract additional signals from them using a variety of models, so you can tell spatial relationships. Some of these visual models, out of the box, may not be able to tell you very well that the man in the red hat is three feet away from the group of people on the right-hand side, or that the man is taller than the boxes. So, from a variety of models, can we extract information about the spatial relationships between the objects in a scene and use it to train one of these multimodal models, so it can say that so-and-so is three feet away, or something like that? One of my favorite examples recently: it might tell you the distances between different parts of a costume, from the tail to the other pieces of the image. It pieces that together from SAM, and we're using Florence to create bounding boxes of objects and locations, and then, using depth estimation, we're making point clouds. That gives you an utterance you can feed into your VLM, and then you'd be able to train a model to get the idea of estimating distances between different objects.
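[Editor's note: a simplified sketch of the spatial-QA synthesis idea. Given per-object point clouds (produced by any detection, segmentation, and depth stack such as the Florence, SAM, and depth-estimation combination described), estimate distances between object centroids and emit question-answer pairs for fine-tuning. The real VQASynth pipeline is more involved; this only shows the overall shape.]

```python
import numpy as np

def centroid(points: np.ndarray) -> np.ndarray:
    # Mean of an (N, 3) point cloud, a crude stand-in for object position.
    return points.mean(axis=0)

def spatial_qa(objects: dict[str, np.ndarray]) -> list[dict[str, str]]:
    # Emit one distance question-answer pair per object pair.
    qa = []
    names = list(objects)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = float(np.linalg.norm(centroid(objects[a]) - centroid(objects[b])))
            qa.append({
                "question": f"How far is the {a} from the {b}?",
                "answer": f"About {d:.1f} meters.",
            })
    return qa

# Fake point clouds (in meters) standing in for real depth-estimation output.
objects = {
    "man in the red hat": np.random.rand(100, 3) + [0.0, 0.0, 2.0],
    "group of people": np.random.rand(100, 3) + [1.0, 0.0, 2.5],
}
print(spatial_qa(objects))
```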
Interesting. So the goal is to train a single model that does all of this together, by synthesizing the data with that pipeline?

Right. It's 3D reconstruction, and we want to distill that into a model, so that with one model you can do everything you could normally do, plus this.

That's something so exciting about the current age of AI: we have the methods to combine so many of these techniques.

That's the idea, and that's what we're thinking too. It's not just LLMs; it could be a whole host, an ecosystem or architecture of models that work together, and they might differ mostly by data. If you can get the base model to have strong priors and related general capabilities, it's not out of the question that you can take it and really specialize it for your application. So we're thinking about foundation models in general: where will they be in another year or two? Will they be doing more in robotics domains, or used more with structured tabular data? What are the different ways we can organize this tool to support how you might work with and iterate on apps based on Transformers?

For the next version of this open-source project, we're even thinking about the individual models we use. In this example, you can see the different components we chain together to get the final prompt-response pairs at the end. Could an agent understand, from the context of "I'm trying to take this collection of images and extract these kinds of question-answer pairs to develop this model," that it needs to find models that can do object tags and captions, understand how to segment objects, get the depth estimation, and piece all of that together so that at the end it produces the dataset you need? That would help design the pipeline itself, besides everything else you might do in fine-tuning and deployment.

That's awesome. Wow, you guys have been busy.

Yeah, a ton of stuff. It's been super fun, and it's very much a time to keep your eyes peeled: a host of things popping up left and right, and folks running into new problems and uncovering new challenges that we're all going to have to think about.

You've done a ton of work. It's impressive. And that's the crazy thing: there's so much left to do. It's wild when you think about what this is trying to accomplish. If we just think about AI in general, sometimes we get so caught up in the technology, the LLMs and agents and all the stuff, that we forget to take a step back: we're trying to create intelligence here, something that can think, basically an artificial brain. It's crazy, and with every single guest it becomes more concrete to me how incredible the progress being made is. And I love that there's so much open source. I think I've heard that word so much in this conversation, just how important open source is. That's what's so unique about this AI space right now: it feels like we're all on the same team, at least for the most part. Everyone's contributing, and it's incredible.

It's a great time to be a part of that community and learn alongside everyone. More and more, the Pandora's box is opening. I think over time we'll all find ways to differentiate our products and projects by how we design or approach a problem. The fun, really exciting part is that the technology itself, all the learnings about how the tech works and how it can be deployed, is becoming more and more common knowledge. So it's going to be less about needing to be an expert just to reach the creative design and application layer, and more about getting more people to that point, so we have even more variety in what we can build.

I have an agent right now that's posting to Twitter for me and making websites for me. It's great.

Exactly.

Well, Terry and Salma, I'm curious as we wrap this up. It's been about eight months, so say six to nine months from now: where do you see Remyx? Where are you hoping to be at that point?

We'll probably be focusing more on these APIs and on customers. It's possible we'll be leaning into use cases we haven't identified yet, working with customers on what's helping them the most.

I think a primary focus is: can we flip all the learnings we've had so far into a tool that your team can use to collaborate, learn from each other's experiments, and have the system learn alongside you, so it can
start giving recommendations. At that point we'll probably be doing more with data synthesis. It'll be more about integrating the tools you've seen: more integrated with the CLI, useful for the infrastructure-as-code part of your experimentation workflows, part of your existing toolkit. And then also giving people the ability to create more complex data synthesis or evaluation passes. Can we take the VQASynth idea and generalize it, help you piece together a custom data pipeline, and from there learn how those changes materialize in your application and improve on that? So the idea is: can we flip a lot of the learnings into something you use alongside Hugging Face, and then have that agent start giving you really useful recommendations on what to try next?

Exciting. We'll see if it happens, or if everything changes between now and then. Who knows.

I wouldn't put it past us.

That's awesome. All right, Jake, do you have any more questions?

I don't think so. I've asked all of mine. Thank you again for coming on. I love hearing what you guys are up to, and I love following your Twitter accounts and seeing everything you post.

Thank you for having us be part of your community. All the material you're putting out, and the folks you're talking to, are super interesting, and hopefully we can find ourselves here again.

Looking forward to it. Well, thank you for coming on. We'll drop links to your X accounts and everything else, for everybody who wants to follow along and try all this out.

That'd be amazing. We appreciate it.

Great, thanks guys. Thank you.