Showing posts with label network. Show all posts

6 Jul 2024

Ollama is missing --rate-limits on downloads

I am just starting my AI journey, and getting Ollama to work on my Linux box turned out to be an interesting non-AI experience.

I noticed that my Linux box reliably got stuck every single time I pulled a new model. htop helped point out that each time I did an ollama pull or ollama run, it spun up a ton of threads.

Often things got so bad that the system became quite unresponsive. Here, in the ping times to the box, you can see "when" I triggered the pull:

Reply from 192.168.85.24: bytes=32 time=7ms TTL=64
Reply from 192.168.85.24: bytes=32 time=7ms TTL=64
Reply from 192.168.85.24: bytes=32 time=7ms TTL=64
Reply from 192.168.85.24: bytes=32 time=8ms TTL=64
Reply from 192.168.85.24: bytes=32 time=65ms TTL=64
Reply from 192.168.85.24: bytes=32 time=286ms TTL=64
Reply from 192.168.85.24: bytes=32 time=286ms TTL=64
Reply from 192.168.85.24: bytes=32 time=304ms TTL=64

A little searching led me to this ongoing GitHub thread, where a feature like --rate-limit was requested for multiple reasons. Some people were unhappy with how a pull clogged their routers, some were unhappy with how it jammed all other downloads / browsing on the machine. I was troubled since my Linux box (a not-so-recent but still 6.5k BogoMIPS 4vCPU i5) slowed to a crawl.

While the --rate-limit feature takes shape, here are two solutions that did work for me:

  1. As soon as I started the fetch (ollama run or ollama pull etc.), I used iotop to change the ionice priority to idle. This made the issue go away completely (or at least made the system quite usable). However, it was still frustrating since (unlike top and htop) one had to type the PIDs... and as you may have guessed already, Ollama creates quite a few when it does such a fetch.

Note that doing something like nice -n 19 did not help here. This was because the ollama processes weren't actually consuming (much) CPU for this task at all!

Then I tried launching it under ionice, which didn't work either! Ollama uses threads, and ionice didn't take effect on the threads within the parent process. So something like the following did not work for me:

# These did not help!

robins@dell:~$ nice -n 19 ollama run mistral # Did not work!
robins@dell:~$ ionice -c3 ollama run mistral # Did not work either!!
  2. After some trial-and-error, a far simpler solution was to run a series of commands immediately after triggering a new model fetch. Essentially, they get the parent PID, and then set ionice for each of the threads (as listed by ps -T) under that parent:
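# Find the PID of the running "ollama run" command
pid=$(ps -ef | grep "ollama run" | grep -v grep | awk '{print $2}')
echo $pid
# List its thread IDs (SPID column) and set each one to the idle I/O class
sudo ionice -c3 -p $(ps -T -p $pid | awk '{print $2}' | grep -v SPID | tr '\r\n' ' ')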

This worked something like this:

robins@dell:~$ pid=`ps -ef | grep "ollama run" | grep -v grep | awk '{print $2}'` && [ ${#pid} -gt 1 ] && ( sudo ionice -c3 -p `ps -T -p $pid | awk '{print $2}' | grep -v SPID | tr '\r\n' ' '` ; echo "done" ) || echo "skip"
skip
robins@dell:~$ pid=`ps -ef | grep "ollama run" | grep -v grep | awk '{print $2}'` && [ ${#pid} -gt 1 ] && ( sudo ionice -c3 -p `ps -T -p $pid | awk '{print $2}' | grep -v SPID | tr '\r\n' ' '` ; echo "done" ) || echo "skip"
done

After the above, iotop started showing idle in front of each of the ollama processes:

Total DISK READ:         0.00 B/s | Total DISK WRITE:         3.27 M/s
Current DISK READ:       0.00 B/s | Current DISK WRITE:      36.76 K/s
    TID  PRIO  USER     DISK READ DISK WRITE>    COMMAND
2692712 idle ollama      0.00 B/s  867.62 K/s ollama serve
2705767 idle ollama      0.00 B/s  852.92 K/s ollama serve
2692707 idle ollama      0.00 B/s  849.24 K/s ollama serve
2693740 idle ollama      0.00 B/s  783.07 K/s ollama serve
      1 be/4 root        0.00 B/s    0.00 B/s init splash
      2 be/4 root        0.00 B/s    0.00 B/s [kthreadd]
      3 be/4 root        0.00 B/s    0.00 B/s [pool_workqueue_release]
      4 be/0 root        0.00 B/s    0.00 B/s [kworker/R-rcu_g]
      5 be/0 root        0.00 B/s    0.00 B/s [kworker/R-rcu_p]
      6 be/0 root        0.00 B/s    0.00 B/s [kworker/R-slub_]
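
The manual steps above (find the PID, list its threads, set each one to idle) could be wrapped into a small helper. This is only a sketch, not something from the GitHub thread: the function names spids_from_ps and ionice_idle_threads and the DRY_RUN variable are made up for illustration, and it assumes a Linux box with procps ps and util-linux ionice.

```shell
# Print the thread IDs (SPID column) found in `ps -T -p PID` output on stdin.
# The header line is skipped; only numeric values are printed.
spids_from_ps() {
  awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }'
}

# Set the idle I/O class (-c3) on every thread of the given PID.
# With DRY_RUN=1 the ionice commands are printed instead of executed.
ionice_idle_threads() {
  pid=$1
  for tid in $(ps -T -p "$pid" | spids_from_ps); do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "ionice -c3 -p $tid"
    else
      sudo ionice -c3 -p "$tid"
    fi
  done
}
```

Something like DRY_RUN=1 ionice_idle_threads "$pid" would show what is about to be changed before running it for real.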

While at it, it was funny to note that the fastest way to see whether the unresponsive system was "going to" recover (because of what I had just tried) was to keep a separate ping session to the Linux box running. On my local network, I knew the system was going to come back to life in the next few seconds when I noticed that the pings began ack'ing in 5-8ms instead of the ~100+ ms seen during the logjam.
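
That ping-watching trick can be roughly automated by labelling each reply as ok or slow. A minimal sketch, assuming the Windows-style "time=7ms" reply lines shown earlier in this post (Linux ping prints "time=7.12 ms" and would need a different pattern); the function name flag_slow_pings and the 100 ms default threshold are arbitrary choices for illustration.

```shell
# Read ping reply lines on stdin and print "<ms> ok" or "<ms> slow".
# $1 is the threshold in milliseconds (defaults to 100).
flag_slow_pings() {
  threshold=${1:-100}
  # Pull out the integer after "time=" (or "time<") and before "ms".
  sed -n 's/.*time[=<]\([0-9][0-9]*\)ms.*/\1/p' |
  while read -r ms; do
    if [ "$ms" -ge "$threshold" ]; then
      echo "$ms slow"
    else
      echo "$ms ok"
    fi
  done
}
```

Piping saved ping output through flag_slow_pings 100 makes the moment the logjam starts (and ends) easy to spot.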

So yeah, +10 on the --rate-limit or something similar!

Reference:

  1. https://github.com/ollama/ollama/issues/2006

24 Dec 2016

Watch an Online Movie: (Wget -c || Deluge) && Chrome > Chromecast

For all those who are in an odd situation where they:

  • Have a paid account on a movie site (like Netflix etc.)
  • Are unable to watch movies online, because the video streaming is just too slow
    • Either because their convenient times are 'peak' times for the server
    • Or because they are behind a painfully bad ISP
  • And are able to download the movie, as an option.

For such customers, downloading the movie overnight (using e.g. wget) would be a big help!

I regularly use this to download a movie and watch it with non-tech people (my kids), to whom it is hard to explain why the movie keeps 'Buffering'!

REM -O KD1242.mp4 : output filename
REM -t 0          : retry indefinitely
REM -c -T 10      : resume a partial download; time out after 10 seconds
REM -w 10         : wait 10 seconds before trying again
c:\bin\wget -O KD1242.mp4 -t 0 -c -T 10 -w 10 "https://www.yourfavouritechannel.com/abcd/KD1242.mp4?g=1f3410635&sha1=JGgJm02BOvqgsdvC32BcUg"

Windows binaries for GPL'ed GNU software (such as Wget) are heaven sent here:



And if you need a big-TV experience (e.g. if you have a Chromecast), you could stitch things together by using the Google Cast extension for your Chrome browser, and open up "c:\" in the browser to play the movie directly there (since VLC streaming is still in beta):


If by chance you're downloading videos via torrents (e.g. NASA videos), here are my GPL recommendations for Windows:

  • Deluge: If you haven't seen this, you should really replace your uTorrent etc. clients with this one
  • Use the Streaming Extension (Github Link)
  • Copy the URL that the Extension provides + Paste to Chrome
  • Cast your tab to your Chromecast + Enjoy!


Have Fun!

18 Dec 2015

Getting my hands on a Google OnHub

So finally, I get to lay my hands on the new Google OnHub at home.

Unlike its other Google cousins, the OnHub isn't yet available for sale in India, and it's (still) a niche product here. The other obvious question is whether you'd pay 6x the price for a router, even when a brand like Google sports it.

I think I would, and so I was finally able to put my foot down when I realised that I had had enough of other routers making things difficult at home.

A few features I liked:
  • Automatic Software update
    • I loved this aspect that is pretty much missing in all its contemporaries
      • Considering that my 10-year-old D-Link 502T hasn't received a single firmware / OS update, I was scared to death, a month back, of all the crapware that might be running on my home WiFi.
      • That, coupled with a few Einsteinish router companies being forced to admit secret (idiotically planned) backdoors, means it just isn't funny to realise that my home router was probably a 'piece of cake' for a script kiddie trying to log in.
  • Prioritize a phone
    • Again, it's a pleasure showing my wife how easy it is for her to prioritize her phone when the kids are watching YouTube in HQ.
  • Router configuration is a breeze
    • It's super simple to manage
      • I just recently got a Chromebook for Audio replacement working at home, and it was pleasant to realise that setting a static IP address wasn't about editing /etc/network/interfaces anymore. The PI2 stayed on DHCP and I set the OnHub to give the MusicBox a static IP from here on... QED :) !
  • Manage your OnHub from China!
    • Once configured, you could manage your router sitting hundreds of miles away!
      • Which basically means no long calls to your GrandMa asking her to read out what is on the screen when she types 'http://192.168.1.1' in the browser.
      • You could be managing multiple OnHubs on your phone, each sitting at your parents' place hundreds of miles away (without VNC / TeamViewer / RDP hacks), and still configure every minute detail such as Port-Forwarding / DNS / DHCP etc. from your smartphone.
  • WiFi connection optimization
    • Frankly, with so many walls, some remote corners of my house have seen some network quality degradation at times, but I haven't seen a 'No Network' message yet. So it's probably doing a good job there, but I can't tell that for sure right away.
Add to that, if we consider that this is a dual-core machine (with a GPU), most of which isn't even put to use (yet), I am pretty excited to know what its real potential is and how Google upgrades my 'boring router' down the line.

Rumour is that this might just be Google's shot at an Echo or a Siri sitting in your drawing room. But till that happens, I'd have to stay pleased with a beautiful router sitting on the desk :)

Now you may want to get paranoid and all and worry about how Google could keep an eye on Dr. Lanning (you), but I have a feeling that it'd take a while before I give breadcrumbs to a Detective Spooner.

All in all, a (pretty) costly router upgrade but I ain't regretting it.


