From the course: AWS Certified Machine Learning - Specialty (MLS-C01) Cert Prep: 1 Data Engineering

Use SageMaker Studio Lab

- [Instructor] Amazon SageMaker Studio Lab is a free environment for prototyping machine learning projects. It's based on JupyterLab. There are a couple of different compute options. You can see here, there are CPU options as well as GPU. The GPU option is one of the more compelling ones for modern machine learning workflows, especially anything involving pre-trained models. If you go through here and take a look, there is a course that you can follow, AWS Machine Learning University, as well as resources like a Hugging Face integration here. And you can see, if you select Open Hugging Face Notebooks, it'll prompt you to copy this project into your SageMaker Studio Lab environment. I've already got that running. And so if I go over here and take a look at SageMaker Studio Lab, I can go through very quickly all of the resources that are copied over from the tutorials and also dive into some of the other features. So let's go ahead and first look at the features of SageMaker Studio Lab. On the left you have the file browser. If I want to browse the file system here, you can see everything that I've got: a bunch of directories with notebooks in them. If I select this icon, it shows me all of the notebooks and terminals that are running. You can see there's a terminal right here. If I go to this icon, it says this is the Git integration utility, so you can actually go through here and integrate with Git. I also can look at the table of contents of Jupyter notebooks, and then enable experimental features over here. Now let's take a quick look at this terminal. One thing that can be useful is looking at your Conda environments. If I type in conda env list, you can see all of the different environments. If I needed to either create my own or toggle between environments, I could do that right here. Another thing I can do from here is monitor the GPU.
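That conda env list step can also be scripted from a notebook cell. Here's a minimal sketch; the helper name list_conda_envs is my own, and it simply returns an empty list on machines where conda isn't on the PATH:

```python
import subprocess

def list_conda_envs():
    """Return conda environment names, or [] if conda isn't available.

    Shells out to `conda env list`, the same command shown in the
    terminal, and skips the comment lines in its output.
    """
    try:
        proc = subprocess.run(
            ["conda", "env", "list"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return [
        line.split()[0]
        for line in proc.stdout.splitlines()
        if line.strip() and not line.startswith("#")
    ]

print(list_conda_envs())
```

From here you could pass one of the returned names to conda activate in the terminal to toggle environments, as shown in the video.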
This is very helpful if you're doing, let's say, local Hugging Face transfer learning, or some other kind of local machine learning job. How would I actually do this? If you type in nvidia-smi -l 1, it'll cycle through and monitor what's happening with the local NVIDIA driver as well as the chip. And you can see here that the driver version is listed, along with the CUDA version. It shows you how much of the GPU is being utilized; in this case, I'm not using the GPU at all. It also tells me the type of GPU, in this case a Tesla T4, which is a nice, high-powered GPU. So I could leave this running if I wanted to and watch everything that's happening on my GPU. Now the other thing we can do, now that we know a little bit about the structure, is go back to the file browser. If we look at the notebooks, there's a bunch of different notebooks here. I'm going to go to the Hugging Face course, and then the Hugging Face course in English. You can see there are different chapters here, corresponding to the O'Reilly book around Hugging Face. If I go to chapter one, I can open up section three here. Now, what's nice about this particular notebook is it really kicks the tires on the concept of what a transformer is and lets me play around with it inside of a curated environment. So the first thing we can do is run pip install datasets evaluate transformers, right here. This goes through and installs the Transformers library from Hugging Face, along with the Datasets and Evaluate libraries. Once we've got that, we're ready to start doing high-level things, including sentiment analysis. The pipeline pulls in pre-trained models from Hugging Face and lets our GPU-based environment run them, which is really nice. And so in this case it says, "No model was supplied, defaulting," to a particular model that's available on Hugging Face.
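The sentiment-analysis step might look like this in the notebook. This is a minimal sketch that assumes the pip install datasets evaluate transformers step has already run; the example sentence is my own:

```python
from transformers import pipeline

# With no model argument, the pipeline warns "No model was supplied,
# defaulting to ..." and pulls a default sentiment model from Hugging Face.
classifier = pipeline("sentiment-analysis")

result = classifier("SageMaker Studio Lab makes prototyping easy!")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': ...}]
```

On the Studio Lab GPU instance, the pipeline runs the model on the Tesla T4; on a CPU instance, the same code falls back to CPU inference.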
In fact, if we want to take a look at these particular models, we can. If I go to Hugging Face, take a look at Models, go to a particular NLP category, for example, summarization, I can pull up a model right here and even play around with it in the browser. But because this environment is using the Transformers pipeline, I can actually pull it right into some code and play around with it there. In this case, if we go through here and classify a sentence, it goes ahead and does that classification. We also can do zero-shot classification, which is pretty nice, because it allows us to supply candidate labels and then see which one the text is most closely associated with. In this case, the text says, "This is a course about the Transformers library." You would think the model would probably pick education as the category, and in fact, it does; it shows that's the most likely label. We can also do text generation, which is kind of cool. You can use this to generate text, and in this case it says right here, "In this course, I will teach you to create effective WordPress admin files." Another thing we can do is use a different model, like distilgpt2, which is also kind of neat. We can also do masking. If you put a mask token right here and call fill-mask, it will auto-complete something in as well. All of these are nice because they can use the local GPU and let you play around with the library. So you get this stuff working locally inside of SageMaker Studio Lab, and then later you can push it into your AWS account as well. Also, if you need to, you can go through here and type in aws s3, for example, and play around with some S3 commands if you've got the credentials set up.
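The other pipeline tasks from the notebook can be sketched the same way. Again, this is a minimal sketch assuming the Transformers library is installed; the prompts and candidate labels here are illustrative, and each pipeline downloads its model on first use:

```python
from transformers import pipeline

# Zero-shot classification: score our own candidate labels against the text.
zero_shot = pipeline("zero-shot-classification")
zs = zero_shot(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)
print(zs["labels"][0])  # the highest-scoring label

# Text generation with an explicitly chosen model, as in the video.
generator = pipeline("text-generation", model="distilgpt2")
gen = generator("In this course, we will teach you how to",
                max_length=30, num_return_sequences=1)
print(gen[0]["generated_text"])

# Fill-mask: the model auto-completes the masked token.
unmasker = pipeline("fill-mask")
fills = unmasker("This course will teach you all about <mask> models.",
                 top_k=2)
for fill in fills:
    print(fill["token_str"], fill["score"])
```

Note that the mask token depends on the model; the default fill-mask model uses <mask>, while BERT-style models use [MASK].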
You can do all kinds of other things as well by looking at the help menu here. And you can see here that the latest version of the CLI is now stable and recommended. I've got this AWS integration as well. So there are a lot of really interesting features available in SageMaker Studio Lab that are designed to push you toward machine learning, and it's a great experimental environment for doing machine learning on the AWS platform.
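Checking whether the AWS CLI is available before trying those aws s3 commands can be sketched with a small helper. This is a hypothetical function of my own; it returns None when the CLI isn't installed:

```python
import subprocess

def aws_cli_version():
    """Return the AWS CLI version banner, or None if the CLI isn't installed.

    Some AWS CLI releases print the version banner to stderr rather than
    stdout, so check both streams.
    """
    try:
        proc = subprocess.run(["aws", "--version"],
                              capture_output=True, text=True)
    except FileNotFoundError:
        return None
    banner = (proc.stdout + proc.stderr).strip()
    return banner or None

print(aws_cli_version())
```

Note that aws --version works without credentials; actual S3 calls like aws s3 ls still need credentials configured, as mentioned in the video.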