{"id":4870,"date":"2023-07-30T16:33:48","date_gmt":"2023-07-30T21:33:48","guid":{"rendered":"https:\/\/dev.iachieved.it\/iachievedit\/?p=4870"},"modified":"2023-07-30T16:54:41","modified_gmt":"2023-07-30T21:54:41","slug":"developing-on-a-mac-python-and-machine-learning-part-i","status":"publish","type":"post","link":"https:\/\/dev.iachieved.it\/iachievedit\/developing-on-a-mac-python-and-machine-learning-part-i\/","title":{"rendered":"Developing on a Mac &#8211; Python and Machine Learning &#8211; Part I"},"content":{"rendered":"<p>I wrote <a href=\"https:\/\/dev.iachieved.it\/iachievedit\/developing-on-a-mac-part-i\/\">Part I<\/a> of the <i>Developing on a Mac<\/i> series to provide a foundation upon which to build a collection of minimal guides to developing software on a Mac. In this post let&#8217;s look at what needs to be installed on your Mac for delving into Machine Learning with Python.  If you haven&#8217;t read <a href=\"https:\/\/dev.iachieved.it\/iachievedit\/developing-on-a-mac-part-i\/\">Part I<\/a>, make sure you do and <i>at least<\/i> install the macOS <b>Developer Tools<\/b>.<\/p>\n<p>I&#8217;ve used macOS <i>Monterey<\/i> and <I>Sonoma<\/I> to develop and test the instructions in this post, but they should apply to Ventura as well.<\/p>\n<h2>Python and Virtual Environments<\/h2>\n<p>Python is an ideal language to begin exploring machine learning.  But, like its programming language cousins, it is <I>real easy<\/I> to get wrapped around the axle with maintaining multiple versions of interpreters and libraries and sorting out conflicts.  
Fortunately we can use the Python module <a href=\"https:\/\/docs.python.org\/3\/library\/venv.html\">`venv`<\/a>.<\/p>\n<p>Some muscle memory will come in handy here, and I recommend you memorize the following:<\/p>\n<pre class=\"lang:default decode:true \" >\r\nmkdir my_project\r\ncd my_project\r\npython3 -m venv venv\r\nsource venv\/bin\/activate\r\npip3 install -r requirements.txt\r\nrehash\r\n<\/pre>\n<p>Let&#8217;s look at each line in detail.  First, we&#8217;re going to create a directory where we&#8217;ll be &#8220;doing our work&#8221; or &#8220;our project&#8221;.  The &#8220;magic incantation&#8221; here is `python3 -m venv venv` which <i>creates a Python3 virtual environment in the directory `venv`<\/i>.  Now, you could name `venv` (the second one) whatever you like.  For example, `python3 -m venv my_virtual_environment`.<\/p>\n<p>Once your virtual environment is created, <i>activate it<\/i> with `source venv\/bin\/activate`.  If you named your virtual environment `my_virtual_environment` you&#8217;d execute `source my_virtual_environment\/bin\/activate`.<\/p>\n<p>Once your environment is activated, install required libraries with `pip3 install -r requirements.txt`.  `requirements.txt` is an actual file you&#8217;ll list your dependencies in; we&#8217;ll get to that in a moment.<\/p>\n<p>Finally, we execute the shell built-in `rehash` to rebuild the hash table used to look up the location of binaries.  <i>This is important for us<\/i> because when we begin installing Python modules that have binaries associated with them (such as `jupyter`) we want to use the virtual environment path, and not something like, say, Homebrew.<\/p>\n<h2>Project Dependencies<\/h2>\n<p>Now, let&#8217;s install some Python packages we use for machine learning.  
I really prefer to use <a href=\"https:\/\/pip.pypa.io\/en\/stable\/reference\/requirements-file-format\/\">`requirements.txt`<\/a> and enumerate all of the Python packages I&#8217;m going to install for whatever I&#8217;m working on.  There are a few common ones I&#8217;ve used for machine learning exercises:<\/p>\n<pre class=\"lang:default decode:true \" >\r\npandas\r\nnumpy\r\njupyter\r\nscikit-learn\r\n<\/pre>\n<p>Write all four of these in a text file named `requirements.txt` and then type:<\/p>\n<p>`pip3 install -r requirements.txt`<\/p>\n<p>Now, type `rehash`.<\/p>\n<p>Editor&#8217;s Note:  Strictly speaking one doesn&#8217;t need to include `numpy` as `pandas` relies on it and will include it.<\/p>\n<p>Once everything is installed (and you&#8217;ve run `rehash`), type `which jupyter`.  <\/p>\n<pre class=\"lang:default decode:true \" >\r\n% which jupyter\r\n\/Users\/joe\/projects\/my_project\/venv\/bin\/jupyter\r\n<\/pre>\n<p>You <i>should<\/i> see that the `jupyter` binary is in your virtual environment.<\/p>\n<h2>The Easiest Regression Exercise Ever<\/h2>\n<p>Let&#8217;s use our virtual Python environment with <a href=\"https:\/\/jupyter.org\/\">Jupyter Notebook<\/a>, <a href=\"https:\/\/pandas.pydata.org\">Pandas<\/a>, <a href=\"https:\/\/numpy.org\">Numpy<\/a> and <a href=\"https:\/\/scikit-learn.org\/stable\/\">Scikit Learn<\/a>.<\/p>\n<p>The following Python one-liner &#8220;generates&#8221; the function <\/p>\n<p>$$f(x) = 3x + 27$$<\/p>\n<p>for <i>x<\/i> in 1 through 9.<\/p>\n<pre class=\"lang:default decode:true \" >\r\n% python3 -c 'for i in range(1,10):  print(\"%d,%d\" % (i,3*i+27))' > regression.csv\r\n% cat regression.csv\r\n1,30\r\n2,33\r\n3,36\r\n4,39\r\n5,42\r\n6,45\r\n7,48\r\n8,51\r\n9,54\r\n<\/pre>\n<p>Editor&#8217;s Note:  If you&#8217;re in a particularly punchy mood, try<\/p>\n<pre class=\"lang:default decode:true \" >\r\npython3 -c 'import random; [print(\"%d,%f\" % (i,3*i+27+10*random.random())) for i in 
range(1,10)]'>regression.csv\r\n<\/pre>\n<p>to create a dataset whose correlation coefficient `r` is not 1.<\/p>\n<p>Create a Jupyter notebook by running `jupyter notebook&#038;` in your terminal window, and then, when the Jupyter homepage comes up, go to <b>File &#8211; New &#8211; Notebook<\/b>.<\/p>\n<p><a href=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/jupyter.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/jupyter.png\" alt=\"\" width=\"545\" height=\"249\" class=\"aligncenter size-full wp-image-4905\" srcset=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/jupyter.png 545w, https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/jupyter-300x137.png 300w\" sizes=\"(max-width: 545px) 100vw, 545px\" \/><\/a><\/p>\n<p>Double-click on the newly created notebook to open it, and in the first cell add:<\/p>\n<pre class=\"lang:default decode:true \" >\r\nimport pandas as pd\r\nimport numpy as np\r\n\r\ndf = pd.read_csv('regression.csv', names=['x','y'])\r\n<\/pre>\n<p><a href=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/regress1.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/regress1.png\" alt=\"\" width=\"430\" height=\"441\" class=\"aligncenter size-full wp-image-4882\" srcset=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/regress1.png 430w, https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/regress1-293x300.png 293w\" sizes=\"(max-width: 430px) 100vw, 430px\" \/><\/a><\/p>\n<p>In a new cell, add:<\/p>\n<pre class=\"lang:default decode:true \" >\r\nfrom sklearn.linear_model import LinearRegression\r\nfrom sklearn.model_selection import train_test_split\r\n\r\ntrain, test = train_test_split(df)\r\n\r\ntrain_X = train['x']\r\ntrain_y = train['y']\r\n\r\nreg = 
LinearRegression()\r\nreg.fit(np.array(train_X).reshape(-1,1), train_y)\r\n<\/pre>\n<p>I won&#8217;t go into the details of <a href=\"https:\/\/scikit-learn.org\/stable\/\">Scikit Learn<\/a>, but you should be able to gather that we are going to train a linear regression model that, given new x values, should be able to predict y values.  Since our data fits a perfect line, we&#8217;d expect pretty good predictions.  As in perfect ones!<\/p>\n<p><a href=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/regress2.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/regress2.png\" alt=\"\" width=\"470\" height=\"439\" class=\"aligncenter size-full wp-image-4890\" srcset=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/regress2.png 470w, https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/regress2-300x280.png 300w\" sizes=\"(max-width: 470px) 100vw, 470px\" \/><\/a><\/p>\n<p>In a new cell, add:<\/p>\n<pre class=\"lang:default decode:true \" >\r\nsome_x = np.array([[20], [30], [40]])\r\n\r\nreg.predict(some_x)\r\n<\/pre>\n<p>and the result should be `array([ 87., 117., 147.])`.  <\/p>\n<p>Nifty!<\/p>\n<h2>Wait, That&#8217;s It?<\/h2>\n<p>Not quite!  Our linear regression algorithm doesn&#8217;t take long on any computer, much less a MacBook Pro.  It is also rather boring.  
Let&#8217;s look at something far more intensive and interesting:  image classification using a deep learning convolutional neural network.<\/p>\n<p>Create a new directory, something like `~\/projects\/imageclassifier` and create a Python virtual environment in it:<\/p>\n<pre class=\"lang:default decode:true \" >\r\ncd ~\/projects\/\r\nmkdir imageclassifier\r\ncd imageclassifier\r\npython3 -m venv venv\r\nsource venv\/bin\/activate\r\n<\/pre>\n<p>In a `requirements.txt` file add one line for now:<\/p>\n<pre class=\"lang:default decode:true \" >\r\ntensorflow\r\n<\/pre>\n<p>Then install it and run `rehash`:<\/p>\n<pre class=\"lang:default decode:true \" >\r\npip3 install -r requirements.txt\r\nrehash\r\n<\/pre>\n<p>We&#8217;re going to use Apple&#8217;s own test script for verifying TensorFlow is correctly installed:<\/p>\n<pre class=\"lang:default decode:true \" >\r\nimport tensorflow as tf\r\n\r\ncifar = tf.keras.datasets.cifar100\r\n(x_train, y_train), (x_test, y_test) = cifar.load_data()\r\nmodel = tf.keras.applications.ResNet50(\r\n    include_top=True,\r\n    weights=None,\r\n    input_shape=(32, 32, 3),\r\n    classes=100,)\r\n\r\nloss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)\r\nmodel.compile(optimizer=\"adam\", loss=loss_fn, metrics=[\"accuracy\"])\r\nmodel.fit(x_train, y_train, epochs=5, batch_size=64)\r\n<\/pre>\n<p>Editor&#8217;s note:  You can read more about the CIFAR100 dataset <a href=\"https:\/\/www.cs.toronto.edu\/~kriz\/cifar.html\">here<\/a>.<\/p>\n<p>Save the above code in a file named `imageclassifier.py` or something like that, and run it.<\/p>\n<pre class=\"lang:default decode:true \" >\r\ntime python3 imageclassifier.py\r\nEpoch 1\/5\r\n782\/782 [==============================] - 320s 407ms\/step - loss: 4.8848 - accuracy: 0.0618\r\nEpoch 2\/5\r\n782\/782 [==============================] - 317s 405ms\/step - loss: 4.3662 - accuracy: 0.0966\r\nEpoch 3\/5\r\n782\/782 [==============================] - 306s 391ms\/step - loss: 3.8930 - accuracy: 0.1386\r\nEpoch 
4\/5\r\n782\/782 [==============================] - 304s 388ms\/step - loss: 3.7569 - accuracy: 0.1514\r\nEpoch 5\/5\r\n782\/782 [==============================] - 309s 396ms\/step - loss: 3.5246 - accuracy: 0.1892\r\npython3 imageclassifier.py  4922.85s user 1125.25s system 386% cpu 26:05.16 total\r\n<\/pre>\n<p>Yikes!  That took just over 26 minutes on a 12-core CPU.<\/p>\n<p><a href=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/cpuOnly.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/cpuOnly.png\" alt=\"\" width=\"644\" height=\"352\" class=\"aligncenter size-full wp-image-4913\" srcset=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/cpuOnly.png 644w, https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/cpuOnly-300x164.png 300w\" sizes=\"(max-width: 644px) 100vw, 644px\" \/><\/a><\/p>\n<h3>TensorFlow Metal to the Rescue<\/h3>\n<p>Fortunately we have access to our Mac&#8217;s GPU through <a href=\"https:\/\/developer.apple.com\/metal\/tensorflow-plugin\/\">TensorFlow Metal<\/a>. In your `requirements.txt` file, add `tensorflow-metal` and run `pip3 install -r requirements.txt` again. 
<\/p>\n<pre class=\"lang:default decode:true \" >\r\ntime python3 imageclassifier.py\r\nEpoch 1\/5\r\n782\/782 [==============================] - 49s 59ms\/step - loss: 4.6411 - accuracy: 0.0827\r\nEpoch 2\/5\r\n782\/782 [==============================] - 45s 58ms\/step - loss: 4.2062 - accuracy: 0.1202\r\nEpoch 3\/5\r\n782\/782 [==============================] - 46s 58ms\/step - loss: 3.7102 - accuracy: 0.1712\r\nEpoch 4\/5\r\n782\/782 [==============================] - 47s 60ms\/step - loss: 3.5657 - accuracy: 0.1978\r\nEpoch 5\/5\r\n782\/782 [==============================] - 46s 59ms\/step - loss: 3.2704 - accuracy: 0.2424\r\npython3 imageclassifier.py  226.09s user 56.42s system 119% cpu 3:56.26 total\r\n<\/pre>\n<p>A bit under four minutes, and we&#8217;re done.  The GPU got a workout.<\/p>\n<p><a href=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/cpu.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/cpu.png\" alt=\"\" width=\"355\" height=\"360\" class=\"aligncenter size-full wp-image-4915\" srcset=\"https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/cpu.png 355w, https:\/\/dev.iachieved.it\/iachievedit\/wp-content\/uploads\/2023\/07\/cpu-296x300.png 296w\" sizes=\"(max-width: 355px) 100vw, 355px\" \/><\/a><\/p>\n<h2>Conclusion<\/h2>\n<p>What I really want to stress in this post is the general pattern for Python development on the Mac:<\/p>\n<ul>\n<li>create a project directory\n<li>create a Python virtual environment with `python3 -m venv venv`\n<li>activate the environment with `source venv\/bin\/activate`\n<li>install required Python packages with `pip3 install -r requirements.txt` in your virtual environment\n<li>issue `rehash` to ensure any commands typed on the command line will be found in your virtual environment!\n<\/ul>\n<p>It really is &#8220;that easy&#8221; (famous last 
words)!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I wrote Part I of the Developing on a Mac series to provide a foundation upon which to build a collection of minimal guides to developing software on a Mac. In this post let&#8217;s look at what needs to be installed on your Mac for delving into Machine Learning with Python. If you haven&#8217;t read [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":3747,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11,12],"tags":[],"class_list":["post-4870","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-apple","category-python"],"_links":{"self":[{"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/posts\/4870"}],"collection":[{"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/comments?post=4870"}],"version-history":[{"count":51,"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/posts\/4870\/revisions"}],"predecessor-version":[{"id":4954,"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/posts\/4870\/revisions\/4954"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/media\/3747"}],"wp:attachment":[{"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/media?parent=4870"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/categories?post=4870"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dev.iachieved.it\/iachievedit\/wp-json\/wp\/v2\/tags?post=4870"}],"curies":[{"name":"
wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}