AI Rebels
The AI Rebels Podcast is dedicated to exploring and documenting the grassroots of the current AI revolution. Every week a new episode is posted wherein the hosts interview entrepreneurs and developers working on the cutting edge. Tune in to benefit from their insight.
From RAG to Resilience: Building AI-Ready Data Foundations with Jim Liddle of Nasuni
What’s the one thing every enterprise overlooks when building AI? According to Jim Liddle, Chief Innovation Officer at Nasuni, it’s not the model, it’s the data architecture. In this episode, Jim explains why unstructured data is both the biggest risk and the biggest opportunity for enterprise AI. He highlights the persistent compliance gap around sensitive information, the challenge of fragmented storage, and the dangers of agentic autonomy without safeguards. Resilience takes center stage, with a focus on immutable snapshots and instant recovery to keep systems reliable. Jim also shows how agent orchestration works in practice, from splitting prompts to managing context windows. Looking ahead, he shares how Nasuni’s FileIQ and OpsIQ can help organizations build AI-ready data foundations without drowning in tool sprawl.
https://www.nasuni.com/
https://www.linkedin.com/in/jimliddle
Hello everybody, and welcome again to another episode of the AI Rebels podcast. As always, I am your co-host Spencer.
And I'm your co-host Jacob, and we are very excited to welcome Jim Liddle on the show with us today. Thanks for coming on, Jim.
No problem, happy to be here.
Jim is the Chief Innovation Officer of Nasuni, which is the company I actually recently joined, so I have been very much looking forward to this. I've heard your name, Jim, since I started at Nasuni. Everyone talks about how you're forward-looking, you're so involved in the AI space. I've heard things like "thought leader" thrown around, which is always impressive.
Big one, yeah.
Don't believe everything you hear, that's all I would say.
All right, I believe it though. Jim, tell us a little bit about yourself, your journey to where you are now in the AI space. We'd love to get a little taste for where you've been.
Sure. I'll tell you a little bit about the journey, because I think the journey is kind of interesting, especially for folks who are a little bit younger than me. I'm originally from the North East of England. As I said to you two guys before we started the podcast, I actually happen to be in the North East of England right now for the first time in a little while, because my parents are still here, but I now live in London. I've been predominantly in the IT industry for about 30 years. I actually started off way back in the day doing kind of system admin work, what we would now call DevOps, I guess. That was with Novell networks, Novell Directory Services, Sun networks, Unix, Ultrix, and Windows NT 3.51 as it was back then. So it was really administering a heterogeneous system and also doing some interesting other things, which included implementing a cross-company digital telephone network, putting in fibre optic cables as a backbone so that we could all
have kind of Ethernet to the desk, which it wasn't at the time, doing some early implementations of Wi-Fi, because Wi-Fi wasn't a big thing back then, and also doing door access systems so we could open the doors with our cards.
Wow, kind of an early implementation of an Internet of Things type of thing.
So that was kind of where I originally started from, and from there I worked for quite a few tech companies, a lot of American startups, a lot of European startups. Back in the late '90s, early 2000s, I was working for a company that did business process and business rules, which was, I guess, some early forays into what we see today as business process automation and agents, except it was more business rules and business process, and obviously not autonomous, but trying to achieve the same things we're trying to achieve today, which is to automate discrete business processes inside of an enterprise. Back then we were describing business processes in WFML, and we were actually doing the business process language in another, you know, ML. The idea was that you could map the business processes out end to end, then decompose them until you got to atomics, start to implement business rules, and then actually run those in an engine end to end. So it should sound kind of familiar if you're familiar with agents, except the difference is of course that was all static implementations, and today it's all potentially autonomous. However, it does kind of smack of the industry trying to solve that problem for a long, long time.
Yeah, definitely.
And here we are 25 years later trying to attack the same thing. From actually working for some of these companies, I worked for another big data company, and coming out of that I've had a couple of attempts at startups, to be
honest with you. This one, the one that was bought by Nasuni, was my third startup, I guess you could call it the successful startup. It was an interesting experience for me because it's not just been being involved in technology, it's raising the capital, working with investors. In that particular startup I've been CEO, CTO, VP of sales, chairman. You wear a lot of hats. So it's been a long career, and a lot of it has been based around data, mostly unstructured data, to be honest with you, rather than structured data, although the work I was doing with the big data company in Hadoop was more around structured data sets. And now, as you said, at Nasuni I'm what's termed the Chief Innovation Officer. That role basically has me looking at technologies that may be applicable to what customers want to do with their data, but also working with customers on what their data strategies are as it pertains to Nasuni and unstructured data as they start to investigate AI.
It's so true what you said about how we're still trying to solve the same problems that we have been forever. My own dad had a data startup as well, and his was focused again on unstructured data, but specifically unstructured data in the context of compliance, right? And it's still a big problem today.
No, exactly.
It's always shocking to me, maybe not shocking anymore, but it's insane how deep the problem goes and how hard of a problem it is.
I think I said to you both earlier, I've been on a couple of podcasts this week, and one of them was actually talking about compliance and, interestingly, why have we never solved the compliance problem with unstructured data, particularly when things like GDPR and CCPA and HIPAA and all of these compliance regimes would make you think
that compliance and governance with unstructured data, you know, we've done that, solved it, it's been put in a box. And then we come to AI and we hit the thorny issue of PHI and PII in data sets where you would kind of expect them not to be by now, right?
Yeah. I think this touches on the bigger problem that we're seeing so much. We've had guests on from all walks, from startups to the very successful and everything in between, single-person entrepreneurs, and honestly data is almost the hardest part of AI and the integration. They can spin up an AI tool all day long. It's just, how do we then digest the data and actually use it?
Totally true. And there are different elements, obviously, of working with unstructured data. You've got the metadata pieces of unstructured data, then you've got the content pieces, and then you've got the schema formats of the metadata and the content pieces. Some of those schema formats are understood, but some are understood but proprietary and actually not consumable by AI tools today. That becomes problematic for the enterprise, because not all of their datasets, in fact maybe not even half of them, are just unstructured textual data. They're in images, which is kind of getting better because now we've got multimodality, and that piece is becoming better, but candidly only in the last 12 months, and even then probably really in the last six months. And then we've got proprietary data sets, where they use a lot of Adobe Premiere, all of the AEC tools, the Revits and the AutoCADs, where it's very difficult to get the data out because the actual metadata schemas themselves are proprietary to the companies. And
the LLM tools, obviously, don't take that data out. That's problematic because when you start to look at it from an industry perspective, all right, you've got some text documents, you've got some images, but they go hand in hand with this other data that's in the same directory. And just as we always say with AI data, you need the entire dataset to get context. So if you're only taking a couple of pieces of it and you can't read the data from the other pieces, then that in itself is giving you only a shadow of the data. And then you've got scale. Scale is a big issue with data. If you start to look at retrieval-augmented generation, I see so many retrieval-augmented generation examples at, like, 10,000 files, 5,000 files, 100,000 files. OK, what about a billion files? What about 100 million files? Because for the customers we have inside of Nasuni, that doesn't even touch the sides. When you're dealing with multi-petabytes, you're dealing with millions to billions of files. And anybody listening to this who's implemented any RAG architectures, whatever flavor of RAG it is, whether it's, I would say, normalized RAG, knowledge RAG, agentic RAG, will know the pain points you have to go through to get quality answers from large RAG datasets. And we still, candidly, haven't nailed that today.
I was gonna say, I've seen some studies and some personal reports from people I know that they have started just moving towards stuffing everything into context and then asking the question.
If you see some of the recent studies, even putting everything into context, which is actually very difficult to do anyway...
Right, exactly, because at some point the data is gonna be bigger than the context window.
Even if you do do that, you will see that
there's a sliding scale. You might have seen some of these studies already: when the context window gets to a certain size, the quality actually starts to go down again.
Right, exactly. And then there's also the problem that, within the context, it will sort of selectively forget parts of it in the process of inference. That's another result that I've seen, and I don't know what's gonna happen to solve it.
Well, I think what I'm starting to see is agentic RAG. If you think about agentic RAG and some of the issues you have with, I would say, vanilla RAG: when you're doing the embeddings and the vectorization and the chunking of the data, everybody who's used RAG knows that the quality you get kind of depends on the size of those chunks and the metadata attached. So people start to have strategies of re-ranking, reducing the size of the chunks, looking at the context windows, et cetera. But still, if the dataset is large, you could end up with 10,000 matches, 50,000 matches, all kind of equally ranked, and even if you run them through a reranker you might still end up with 10,000 matches. However, if you look at some of the agentic RAG examples we've seen, and actually a lot of the foundational vendors are already doing this underneath the hood, what they're doing is taking your prompt, and they have a kind of orchestrator agent, if you like, and underneath the hood they're splitting the prompt into parts and farming it out to different agents.
Interesting.
And that's interesting because each of those agents has its own context window. So you can start to see that, depending on what the input prompt is and depending on what the data set from a RAG
perspective is in the background, you could actually have, say, 10 agents, but they're all targeted at a specific section of the dataset, which is gonna give me better quality of results when it's been vectorized, embedded, chunked, and brought back in. So I start to see some patterns like that, and they look interesting, actually.
Yeah, an agentic approach is interesting to me. It's funny, I feel like we say this every episode, but look at the best practices we see in organizations for digesting and using data: you need a team, a team with specializations in how to identify various trends in the data, things like this. And slowly AI has morphed to become, OK, now let's use agents to build a team. People are just blown away, oh my gosh, an agentic approach, and it's like, this is the same thing we've been doing, we're just now applying it to these agents.
It's interesting, because you've just brought to my mind something I read on Reddit today, on the agents subreddit.
It's always good.
Yeah, it's good. I read today that somebody's got a solo startup, a solo entrepreneur, and all of their employees are agents. So they've got a media agent who's responsible for posting on Twitter and Mastodon and all those things, and they've got another agent who writes copy, for example. It just amused me, because they'd shown you their terminal, and they'd named all the agents as worker agents, but specifically by the action they were performing, and that was their employee list.
We had a guest on who talked about this phenomenon, and he said, I would not be shocked if, not too long in the future, someone was able to come to a company and say, hey, I have cloned your HR department with agents, pay me and I can do
your whole HR department for this monthly fee. It's gonna become this cloning of entire departments.
Yeah, I think that's true. I mean, if you look at the way that some of the virtual employee startups are targeting that, they're asking you to employ their agents. That's kind of their terminology: not hiring people but hiring agents. I would give the caveat that although the technology is exciting, it's still very immature. You only need to look at some of the things that have happened even in the last few weeks. You probably saw what allegedly happened with Replit and the agent. Did you see that?
Oh yeah, I did see this one.
The database got deleted, et cetera. And then the Amazon Q one, where somebody was able to inject something that the agent, you know, et cetera. So I think what it's surfaced is, one, the immaturity of that whole ecosystem, because it is still relatively immature, and two, there are new attack surfaces, or attack vectors, that are starting to open up: not only misbehaving agents but also supply chain attacks and things of that nature, like we saw.
What I've been waiting for is someone to manage to put an LLM on their website in a way that allows arbitrary script execution, and they're going to manage to commit, like, a cross-site scripting attack against themselves with an LLM.
Oh, interesting.
I'm convinced it's gonna happen, and I'm keeping my eye out, because I guarantee that with the rise of generative UI specifically, there's gonna be someone who implemented it in a really dumb way, and I'm really excited to see who it is.
Did you see the MCP attacks through Chrome?
Yes, yes, exactly.
I didn't.
The ones that Simon Willison... did you read his post about it? So basically... I'm blanking on the details
right now. Did you, maybe...
Yeah, so basically what happened: most people are running, like, an LLM on MCPs or Claude on MCPs, right? A lot of the devs are running that on their laptop. If you run that on your laptop, you're running it on something like port 3000, and it's open, and it's kind of a web server waiting to get a request. Somebody actually managed to surreptitiously put into a Chrome plug-in a check in the background to see whether anything was listening on port 3000, then check to see if it was an MCP file system, and then start to give it commands.
Oh my gosh.
And of course you didn't know anything about this, because it's all happening in the background. And of course it's just a web server and you're in a browser, so it can leverage it. And the security company found several of those that were actually active in the Chrome Web Store.
Interesting. Oh, that's a little scary.
It is, and this brings me back to a topic that's obviously dear to my heart because of what we do at Nasuni, which is AI data resilience. We talk about ransomware resilience a lot, but because of these new attack vectors that are surfacing, I honestly think we'll start to see this AI data resilience term. Without a shadow of a doubt, whether it's a misbehaving agent, whether it's an error because of the way you've written the prompt, or whether it's an actual attack, you will see that on some of these multi-step processes, a lot of the data you'll be leveraging will be from the file system, because that's where your operational data is. So if your file system suddenly becomes either directly or inadvertently attacked, you're gonna wanna be able to get the data back really quickly. And the reason is it's gonna be locked into, like, an agentic business process that's gonna be in your front office
or back office value chain. So it's just gonna run, and at some point you'll have forgotten about it. So at the point it stops and the data is corrupted, you don't wanna be running around like a headless chicken trying to figure out how to get the business process back online and how to get the data back. At that point you really need to be able to go back to the immutable copy and just flick the pointer, get the data back, and then go and figure out what the problem was in the first place, but get it back online and running.
Yeah, I'd love to hear more about what you mean exactly by AI data resilience too, I guess beyond what we just discussed.
It means that the data being used with AI is ultimately resilient, in the sense that your perimeter security and your underlying security at each threat level are able to provide you with data resilience and quick data recovery. Obviously there's more than the file system to take into account when you're thinking about that, but as I say, a large proportion of those multi-step processes are gonna target operational data, and that's gonna be on the file system. Why is it gonna be operational data? Yes, it's gonna be taking CRM and ERP data, but, going back to that example I gave earlier around the context of interlinking datasets for relevance, a lot of it's only gonna be relevant if you can actually target up-to-date operational data, and a lot of that's gonna reside on the file system. So you want to make sure you can get back to a point in time where the data was actually clean, and you wanna be able to do that quickly, at speed. Take a look at what happened with Replit. Was the guy able to get back a copy of his production database, which had been wiped, and if so, how quickly? That would be an example of AI data
resilience.
Interesting. Which becomes more and more important as AI becomes more and more autonomous, I think.
Well, it's the autonomy that scares people. I think it is scary, and I'm sure you've had a lot of people on the podcast talking about the guardrails that you have to put in place, right? And I would say data resilience is just one of those guardrails, along with how you construct your prompts, et cetera.
So obviously you are in this space. Nasuni deals with data, and you have dealt with it for a long time. For companies looking to become more AI data resilient, what would you tell them? How would you do that?
Well, I think that's wrapped into what their data strategy is. We talked about it at the very beginning of the podcast: AI success is really all about the data. And it's not just about the data, it's about the architecture of where the data sits. One of the issues I see with companies, and I'll round-trip back to your question, is this. I'll give you a classic example. If you look at most enterprise companies, they have more than one location, in fact they'll probably have several. Those locations might be geographically dispersed, or they might be just, for example, within the United States, but they'll have several of them. And in the past, to facilitate people working at those locations, they'll probably have had on-premise storage, because that's what everybody did. Whether that's Windows file servers or NAS doesn't really matter, maybe it's a combination of both, because that's what a lot of companies historically have had. But those types of infrastructures and architectures were designed for people. They weren't designed for AI. They were designed to accommodate the fact that
data has gravity at certain locations, and applications that are on premise need access to the data at speed, therefore you put the data where the applications are. You see that everywhere, and we all recognize that pattern. However, even pre-AI, that pattern had issues. All right, so you're an architecture firm, you've got a dataset for a project, and you have teams in different locations that specialize in different parts of that project. How do you get them all working on the same dataset at the same time when you've got three separate siloed infrastructures from a data perspective? We all know that in the past companies have tried things like Microsoft DFS, they've tried replication to keep the datasets in sync overnight, or they've tried to send each other stuff, all sorts of things to try to work concurrently and collaboratively on a dataset. So that's always been a problem. But when you shift it forwards towards AI, you see the issue that different parts of our project and different parts of our business have the data scattered among seven entities. So how do we give AI context, to make sure that it's looking at the entire dataset when we're asking questions about our business? And how do we do that so it's kind of easy? I don't wanna have to think about how I get all of the data back to one place in some sort of integration project, because that's gonna get worse over time and it's something we're gonna have to maintain. And then, round-tripping back to data resilience, that's kind of all baked in for me. You want to implement an architecture that doesn't have any gaps. What I mean by that is, when you're dealing with data, if you have to separately back the data up and think about backing the data up, then you have a gap in your architecture in how the data is being persisted in the first place and what
you do when you change the data. So if you have to think, oh well, I need to back up all my seven locations and keep doing that over time, or I've just exceeded my weekend backup window, so I'm gonna have to do a monthly backup window and maybe do it over a bank holiday or a public holiday, something of that nature. What you really want to happen is that every time there's a change, there's a snapshot, and it's immutable, and you can get back to it. And what you really want is that every time somebody implements a new order, a new change, in office A, offices B and C can see it immediately, almost as soon as the lock's released, and so can the AI, because it's available to that too. That's the sort of data architecture, particularly from a file data perspective, I would argue you want for 2025.
Yeah, I think that we'll see, probably not Git specifically, but a lot more source control, excuse me, version control software for things beyond code that focuses on a Git-style interface.
You, being a developer, must look at that and think, oh, we've been doing that for years.
Yeah, but then I think about trying to apply, you know, a Git tree to petabytes of data, and I'm glad that's not my problem to solve.
I'm old enough to remember working with multiple offices and trying to resolve Perforce conflicts, so yeah.
It's insane, the scale that is out there as far as data goes.
Yeah, and listen, obviously Jacob and I work for Nasuni, and that's one of the fundamental problems we solve, which is to architect the data in such a way that it's easy to have different data sets be kept in sync between different offices, it's easy to get a single namespace, and you don't have to worry about backup because it's all intrinsically built in. But whether you use Nasuni or whether
you use something else, my argument would remain the same: that's the type of data architecture you really want to underpin your AI strategy, because at that point you can really start building your AI house on solid foundations.
Right. Going back to my little Git example, I cannot tell you how many times I've been coding and been lazy about the way I'm directing the AI, and I get to a point where the entire app breaks, and it's so nice to just hit git reset to the last commit that I want, and all that history is gone and I have a clean slate. And I wish for...
That analogy is no different to changing the pointer to get back to a point in time with your data. Same analogy.
Yeah, exactly, and I wish for everyone to experience that, because it's a beautiful feeling. It gives you so much room to experiment too. It enables not just experimentation with AI but a more rapid iteration cycle in general, for anyone.
I agree, and actually I think there'll be some context there around what we'll see with agentic and agents moving forward. I don't know if you've seen Manus, the agentic platform? They released a very good paper recently where they talked about the fact that they redesigned their agentic framework several times. Because of the context problem we talked about earlier and the issues that come with it, they started to use a shared file system to get long-term context across agentic session windows, if you like. That pattern stores agentic history in JSON files that it can leverage when it needs to go back and understand. And I can see that being used quite a lot. I can see the ability to have shared organisational knowledge across agentic sessions that you can go
back to and check or re-leverage in a particular location on the file system. That would be great. But again, it gets back to the same point we just talked about: how great would it be to go back in time to a specific point of that agentic knowledge if something goes wrong with what you're doing with the agent?
Right. I think it's essential for AI to move forward. We have to have these backstops where we can feel comfortable. Jim, I'm very curious about your answer to this question, because obviously Nasuni is kind of the foundation for AI, right? It's the data. You care a lot about AI, you're very involved in this space. Why is it that you're still at Nasuni? I would be shocked if, with all this AI hype, you had not been approached to be involved in other AI ventures.
Well, I think there's a few reasons for that. First of all, Nasuni bought the company that I worked on for 10 years, so there's a certain sense of finishing that journey. I'm also in a place where I'm still having fun and it's exciting. And it's interesting, because obviously you would envisage that the sexiest place to be right now would be an AI company, a pure-play AI company, right? I would argue the antithesis of that. I would say the sexiest place to be right now, from an employee perspective, is a data company. Because ultimately, with AI companies, not the big ones but some of the smaller ones, every time we see a new drop of a foundational AI platform, we see the crushing of some of these AI companies that have had to go away. It's very difficult to do anything that you can't just do by interacting with the AI directly, and that makes it very difficult for some of these companies. And I think the moat around some of these AI companies is becoming bigger. When you think that the Anthropic CEO said recently that to train the next
foundation model was going to be kind of $1 billion and up, it gives you an example of the moat there's gonna be to continue to build these large foundational models. Now, that doesn't mean there won't be SLMs and small models. There will, and there are some really nice ones that have come out now. The Google Gemma 3n, like the 2B model, which is really a 5B model because of the way they've compressed it. I can see some of those discretely being used in departments.
Great, yeah.
But I think the bigger AI companies will be whittled down, because it's gonna be so expensive to keep that going. But the data companies... if you look at what Elon Musk said recently, he said that all of the data that you can consume to train AI is gone. Publicly, there is none left. We've used it all.
Which is insane.
It is insane. Whether that's hyperbole or not, I don't know. But if you believe that's true, then you've got two choices: you've got synthetic data, or you've got some of this domain data that resides inside of the data companies and the customers. And I think it's exciting to work at a data company. So in answer to your question, why am I still here, it's probably one of the most exciting places to be in the age of AI.
Totally. I like that answer, especially because you touched on how some of these big players are gonna start getting whittled down. Fundamentally, I would not wanna be a Nous Research or a Mistral, and even Anthropic I think might be in trouble, because they're in a spending war against Google, and that's the exact kind of war that Google's built to win. And fundamentally what it comes down to is that Google has all that data too, right? They're more than just the model training. They have so much data already.
They do, they
do, although I saw a recent statistic today that said that Claude actually secured some new investment, and it values them really highly. I actually think that Anthropic with Claude, obviously Musk's Grok, OpenAI and probably Google, outside of what we see coming out of China, which is obviously disruptive, are probably the top foundational models. I do think there's room for people like Mistral in the market, because some of the European countries are gravitating towards having regional AI. Some of that is because of how the data's being held, what you can do with it, what's going in, and the legislation that they've put in, right? The EU AI Data Act is kind of forcing you down a route where, if you're residing in those countries, you might well end up putting your data into a more regionally based AI. Actually, if you look at Trump's recent AI Action Plan, one of the things he talked about in that was, if your company's working with federal contracts, needing to be able to prove that there's no bias in how you've used AI within that contract. Now, if you think about that, anybody who knows anything about AI knows that bias is inherent in every foundational model, because it's been trained on articles and documents that have been created by people who have inherent bias by default. So you might even end up with, I don't know, maybe models that have been trained specifically, that you have to use if you're working on federal contracts in the future. That wouldn't be beyond the pale. I think there's enough room, particularly for some of the smaller vendors, particularly when you start to see some of the chips that are gonna come out that are gonna compete with Nvidia. That will mean that potentially some of these models will
run reasonably well on chips that don't require GPUs. You can kind of see how somebody like Jacob, who may not want some of that data to go onto any public cloud, even if it's in a tenant, might use a small model to do various things for the finance department, for example.
Yeah, right. You touched on something really interesting there towards the end, where you said all data has biases. That's interesting, one, because it's true, but two, there are organizations that try to purge data of biases, which to me has always been a funny idea, because they're humans trying to purge data of biases, but they're determining what's a bias with their own biases. It's this loop.
By default it's not an easy thing to do, because one person's bias is another person's, you know, etcetera.
Yeah, exactly.
If you dig deep into that federal action plan, going all the way back to the very beginning and round-tripping to talking about data governance, I think one of the things it will force vendors who are dealing with certain contracts to do is to document what LLM they're using, document what steps they've taken to try and avoid bias, document the prompts that they put in as part of what they did for the contract, and have much more solid governance around the whole end-to-end process. And if you look at what's come out of Anthropic recently, you might have seen it: the ability to look underneath the hood and see how the neurons fired. Not just the chain of thought that you see when it's going through its thinking process, but, behind the curtain, how did the LLM get to make that decision, and actually track it in real time. So I can see some of those types of things perhaps becoming
much more formalized and being leveraged where you have to document end to end how AI was used, for a very transparent, if you like, interaction that you have with it.
Which really opens up a whole market. My background is in auditing. I was at KPMG, did auditing, and hearing you talk about that, I'm like, oh, that's a huge opportunity: a market for AI auditors, where you need the technical side but also the soft side of understanding the implications of what the LLM is doing and the downstream effects. I don't know what that would look like, if it would look like a financial statement audit, but it's like a chain-of-thought LLM audit where you sign off on a project and you say, we audited this LLM's output.
I think at some point auditors are gonna probably, I mean, if you think about, again, back to the action plan, somebody somewhere is gonna have to audit what comes back to them, right? So they're gonna have to do exactly what you've just described and sign it off ultimately. And actually, it'd be interesting, I mean, we haven't touched on it, but one of the big things about AI is critical thinking. By default, the best critical thinkers are people who've come from a kind of corporate finance background, truthfully, because they're trained to think that way. And I'm not talking about context engineering or prompt engineering; that's really something different. Critical thinking is just a way of thinking, to be able to ask the question the right way to get the right answer. It's the difference between asking something like, tell me what extra feature I should add to X product to get some more money, and asking, tell me what feature I should add to my product that will take the lowest amount of effort and generate the highest amount of ROI, and please take into account these competitors who may have
done something similar. You know, something way more detailed. And not everyone can do that. Not every employee can do that. I think what will be interesting over the coming times will be, how are we gonna figure out which employees can do that, that we wanna hire, and those who can't? You can't teach that in some ways. You'll be hiring people not just for their skills; you'll be hiring people for their skills and how they can interact with the AI you've got inside your organisation. I don't think that's too far away, to be honest. I think people will start to get assessment tests. It might even be here in some industries.
Yeah. What would you say, Jim, you would look for? If we set aside the engineering, the technical coding, all things like that. You said critical thinking. Is there anything else you would say is crucial to be able to effectively use and apply AI on an individual basis?
Yeah, obviously critical thinking is the number one. I think the number two is obviously to understand some prompt engineering. Take, for example, one of you two going to apply for a job that requires AI, and your average 45-to-50-year-old who might be going for the same job. The chances are that, going through the door, you two can demonstrably show whoever's on the other end of that interview what you've done with AI tools before, what those AI tools are, which you favour and why you favour them, and what experience you've got in them. I think that might be harder for somebody who's older. And I think that gap is going to be pretty wide for people coming out of college right now. Somebody coming out of college today has probably used all sorts of AI on their way through their journey to get qualified, right? And whether
or not they were supposed to, they have gained the skill sets in that. Now imagine when they're choosing where to work. They're gonna make some choices dependent on not only the company but also the technology that the company has within it, and the feel that they get through the interview, not just the assessment that's going on at the other end, because they'll want to know that they can leverage their skill sets. Actually, it takes me to a little anecdote. Because of my title inside Nasuni, people just fling emails at me and say, oh, I'm looking for a job, or I need this. And I saw one recently that said, I'm a developer, I'm just about to finish college and I'd like to have a development job in AI; however, if you don't let me use AI during the interview, you will only see 60% of what I can do. I thought that was kind of interesting, because I think that's probably where those types of things are going.
Yeah, it's definitely enhancing. That's the way I view AI currently: it's an ability enhancer.
Yeah. Recently Aaron Levie from Box wrote something on LinkedIn that said he can use AI like 20 times a day to go and check X, Y and Z and do deep research on it, so that it's all there and ready for them when they go and debate it in a meeting. Something that obviously you couldn't do even probably six months ago, right?
Yeah, hundred percent. Jake and I talk about this all the time. It's pretty crucial to start talking about this stuff, because there are a lot of people who don't realize how far it's come already. There are a lot of people, for example, who, as far as image generation goes, are still stuck on the idea that AI can't do hands. We're well past that. Not only can AI do hands, but at
the same time it can go do some financial auditing for you, and it's the same model doing both. Haha.
Yeah, that's been the big change, and I think recently we saw that with OpenAI: the ability to not have separate models, one for image, one for voice, one for text, and to have them in the same neural net in the background, so that you've got access across the entire modality. That's had a huge effect on the quality of what you get at the other end. OpenAI could not write text inside of an image to save its life, and now I would say, like, high 90 percent of the time it gets it right.
Which is very recent. It's a very recent development.
Yeah, really within the last 8 weeks.
So I'd be interested: at Nasuni, are you guys leveraging multimodality for any data ingestion or processes yet?
Yeah. Without commenting on the internals, we do leverage AI in all sorts of different places. Obviously we've got a marketing and creative team, and what is AI really good for, for those sorts of teams? Actually, it's good for ideation, the ability to augment your thought processes. It's perfect for that. I would argue that when you sit down and write an article, and you've probably seen I do lots of articles for the Nasuni blog and what have you, I still think that you wanna do that from the heart, and you wanna do that as a person. You don't want AI to write that for you. You might want AI to check it for you. AIs are very good at that. If you feed an article in and you ask it to check the authenticity of the article, to check that you haven't accidentally plagiarized somebody, to check that the quality of it is correct, to curate it if you like, that's always a very good thing to do. And in my experience, coming out the other end, it'll
sometimes make some suggestions about things you've missed that can make the article better, and then you go back and you do it, but you do it, rather than the AI doing it.
I was gonna make a similar comment, which is that I find it works really well just giving you feedback, as though you were back in school writing an essay for your teacher. You pass the rough draft to your teacher, your teacher says, this paragraph sucks, rewrite it to be more impactful, and you say, oh, okay, I'll go rewrite it to be more impactful. Something that I love doing with AI is, if I have a sentence or a phrase that I just can't get quite right myself, I'll pass it into ChatGPT or Claude or whoever. I love asking it for rephrasings, because I usually don't like any of them, but they make me think harder about what I dislike and like about my own phrasing, and what I dislike and like about the AI's rephrasing. And then I pull the different pieces together myself.
Yup, I totally agree. There are certainly AI-isms in terms of how they phrase things.
Yeah, exactly. If you ask them to do something, they all start by saying, in this world of AI innovation.
Yes, it's very AI-like. It's funny, as you become more and more familiar with AI, it becomes more and more apparent when something is written with AI. Like, they always use "like" everywhere, all the way through. Or the ampersand, or the em dash. They always use the em dash.
Yeah. Well, I don't love this, I hate it: when I'm reading back on my own writing, I'm like, God dang it, I sound like an AI sometimes. And it's stuff that I've been saying for years, right? But for whatever reason that particular turn of phrase has made its way into AI, and now I can't use it. Ha ha ha.
It's
funny you say that, because we have a lady in Nasuni who trained as a journalist, so she knows where syntactically the em dash should be used. And she's like, I know it should be there, but should I put it there? Because people are gonna think that I've used AI.
Yeah, I know. I avoid the em dash now. I've become a big fan of semicolons, let me tell you.
Yeah, exactly. It's interesting: in this chat that we've had here, in a short space of time, we've touched a gamut of topics, right across. It just shows you how big and deep and wide it is, because we could have taken any of those topics. You couldn't be an expert in all of them, truthfully. To go really deep now with AI, you kinda gotta pick which pieces you really wanna be deep in. Take agentic, for example. If you wanna get really good with agents, it's split into such a multi-discipline. You've got Model Context Protocol, for example, you've got things like Agent Control Protocol, you've got lots of different mechanisms and frameworks to work with agents. It's kind of hard to be the expert in all of them, right?
Yeah, it really is so huge. When I talk to friends, family, whoever about it, I can't stress enough that this touches everything. AI truly has changed, and will continue to change, every aspect of life. When we've talked in the past, I think the closest I can liken it to is the internet, or something like that, where it truly has impacted almost every single area of life. That's what we're witnessing now.
Yeah, I think it's more like electricity.
I'm really fond of comparing it to the printing press myself. Hahaha.
I mean, all of those ones we've just talked about had a huge effect, right? The world was never the same after the internet, the world was never the same after the printing press, the world was never the same
after electricity.
Yeah, and that's what we're seeing right now.
It's true, and it's starting to become democratized in certain ways. The next big thing, I guess, is gonna be robotics with AI. When you see the recent one that came out of China, the humanoid G1, I think it was retailing for some crazy price like twelve and a half thousand dollars. As soon as it starts to hit those price points and become available, it's just gonna be a next step up. And actually, if you look at Yann LeCun from Meta, his view has always been that we'll only really start to see huge advancements in AI when it's completely interacting with the world around us.
I just have to say, I love Yann.
Yeah, it's development. Yeah, he is good. I think it's necessary for AI. It's gonna force that issue.
Yeah. Oh man, it's exciting. I can't really get my head around how much it's changing and how far it still has to go. Jim, as we're getting closer to the end, I wanted to make sure to ask you specifically about Nasuni, probably because I work there and I'm curious what you would say to this. How do you see Nasuni in the next three to five years implementing, involving, using AI to help shape what they're doing?
Yeah, well, a lot of that obviously comes from, you know, I feed into our product process, so people like Andres Rodriguez and Nick Burling, who's the SVP of product, are working really hard on some of those roadmap items. But I would say that we're not going to be a general-purpose AI company, for all the reasons we talked about earlier. All of those companies just get swept up, and the moat is too big. And actually, if you look at the enterprise, I'd say that they're already fed up of vendors forcing AI models on them, because they wanna use the enterprise AI models that they're comfortable with, within
their own tenant, or they wanna use private, one of the two. What they don't want is this plethora of different models that they've been forced by vendors to put data in. So I think one of the things that we're really focused on is, how do we enable our customers to leverage AI in the best way, and make it easy and simple for them to do that? Best-of-breed AI. There are all sorts of different ways we can think about that. First of all, we wanna be truly the best data foundation for unstructured data, and on that journey you've seen us do recent things like FileIQ and OpsIQ. We're also looking at how agentic can feed into that. What will that mean for Nasuni? What's gonna be the predefined standard for agentic? We talked earlier about how that's still pretty immature. I think a lot of the enterprises are only just getting into RAG, never mind taking a really deep dive into agents, and as we talked about, there are several standards around that. So where it makes sense and is practical, and we can make it as transparent as what we talked about with the data resilience, that's really our aim. How do we make it so easy for you to use it and to integrate with AI that you don't always know that it's even happening? If we can achieve that, then we'll have achieved something that's actually pretty hard to achieve, because even today we have customers who think of what we do around snapshots and moving data around as if it's magic, because it just happens. We want to have exactly that same effect with what we allow them to do with AI.
I think that's very astute, because something that I've thought about a lot, and we've talked about a lot on this podcast, is just the inertia of business in general being an impediment to the adoption of AI.
It's not necessarily that people don't want to adopt it, and it's not necessarily that people are scared of it, although those are some causes behind that. It's just that businesses are big, and it's hard to shift the process of it and the flow of it and where it's going. So the companies that can, like you said, make it invisible but highly effective are going to deliver massive value. It's not a huge fight to adopt a bunch of new processes. On the business side it's just, hey, we're switching to this new provider and they're gonna do a bunch of cool stuff for us.
Yeah. As I said, most enterprises have gravitated towards the hyperscalers. From an AI perspective, that means they have a tenant on AWS or on Azure or on Google, and they're leveraging an LLM within that tenant. That's just a service, just like popping up anything inside of those tenants. They can put it up, they know that it's bounded by the security within that tenant, and when they tear it down, it's gone. When you talk about AI with companies like that, and the majority of the enterprise does that, what they wanna know is, how do I get my data in Nasuni talking to these systems over here? How do I do it easily, so that I can get the best value?
Yep, which really helps companies unlock it. It's been this thing forever. People are like, oh, I have three petabytes of data and we use 100 terabytes of it. And this, I think, is that unlock, hopefully, right? All of a sudden it allows AI to be involved to help you use the most relevant data. Say, at a bare minimum, it probably expands another 30 petabytes.
Yeah. I say never forget that not everything is an AI problem. Actually, data visibility is not necessarily an AI problem; it's a search problem. Now, we're starting to think of search as being semantic search, effective search, a conversational interaction, but it's not really. It's about
indexing your data, indexing the content, and being able to rapidly get access to that content based on filters or facets. For example, I want to know any PDF in which Jacob talked about AI in the last six weeks, and I'll get a list of documents. You would never get that query executed on AI. AI couldn't really execute that query, just because of the way that we know AI works in terms of how it takes in data. So don't forget data visibility. Most enterprises today, for their two petabytes of data, still don't have good visibility into that. Right back at the beginning, we talked about what you need to do with data to get to a good data strategy. One of the things is to get visibility into your data sets, to know what's gonna be useful for AI and what's not. And to do that, you have to be able to see into the data.
Fascinating circular problem. Ha ha ha.
Yeah, it is. It's gonna be a chicken-and-egg problem, but you do need to go through the steps. What you tend to see today in the enterprise is people will take the data sets that they know really well, and it'll be a very small proportion of those two petabytes we talked about, and they'll implement something small like an internal or external-facing chatbot. But ultimately, over time, to get really good operational insights into their data sets, they're gonna have to tackle data governance, data life cycle, curation and visibility. They'll have to do all of that across the data set.
Yeah, there are a lot of contributing things to really having an effective data and AI strategy. Totally agree. Jim, last question for you before we wrap up: what advice would you give to the general public, someone who's not super technical, who wants to better understand AI?
I would say use it. I'm a big proponent of actually saying that you actually don't know what you, I
think you don't really know what it can do unless you start to use it day to day. I would say the best thing you can do to get familiar with AI is to use it, and the more you use it, the more your brain's gonna start thinking, oh, I could do this with it, I could do that with it. Can it do this? Can it do that? That's the best way to start to understand it: not only what it can do for you, but where the edge cases are, what it can't do, and how you're gonna interact with it. A good guy who's been doing that for a long time is Ethan Mollick. I don't know whether you follow that guy on LinkedIn.
Yeah, Ethan's great.
If you're on the LinkedIn platform, I would say go and follow Ethan Mollick, or go to his Instagram, because he's always experimenting.
Yeah, and he's always posting about it too. I love the way he makes it very easy to approach.
He does. I would say he's a great guy to follow for that sort of stuff. And he's learned, by the way, by doing exactly what we just said: literally throwing everything at it and seeing what it can do.
Yep. Another common complaint about AI I see is, oh, I want AI that does my dishes and my laundry, I don't want my AI to do my art. I cannot stress heavily enough to people like that: AI is here. It's more abstract than that, right? But use it at work. You can automate the 95% of your job that's not enjoyable and you can focus on the 5% that is, and deepen your skills.
I agree. Make AI do the mundane, so that it leaves you available to think about doing the stuff that you really wanna do.
Totally. Awesome. Well, Jim, thank you so much for coming on. Both of us, but especially I, yeah, I'm very excited
to see where Nasuni goes, and your role as we move forward.
Spencer's converted.
Spread the word. Give me a hat, I'll wear it.
Oh, it's coming.
Alright, Jim, we'll definitely stay in contact. Oh, actually, real quick, I meant to ask this as well: if anyone wants to follow you, what's the best way for them to do that?
Yeah, they can find me. I'm kind of Jim Liddle everywhere, so they'll get me on LinkedIn as Jim Liddle, they'll get me on X as Jim Liddle, and they'll get me on Bluesky as Jim Liddle, and Mastodon. So, easy to find.
That's easy. Okay, thanks, Jim.
Thanks. No problem. Take care. Bye.