Over the last month or so, I have been working on a "Covid-19 search engine" that aims to datamine the ~44,000 articles about Covid-19 and related infectious diseases to answer simple questions such as "What are the biggest risk factors for Covid-19?" and "How is Covid-19 transmitted?"

My solution (which I'll be writing about in detail in future posts) consists of two components: a document retrieval component that quickly identifies the top documents, and the sentences therein, that are relevant to a query, and a span retrieval component that uses a BERT model fine-tuned on the SQuAD dataset to identify the most relevant spans in the body text of the top matching documents. The system works pretty well – see the screenshot below.

![Screenshot of the search application]()

I wanted to turn the search application running locally on my personal deep learning machine into a demo accessible over the web. The obvious solution is to containerize the application and host it on AWS. The trouble with that solution is that inference on BERT-style models is very compute heavy: on a 1080 Ti GPU the search takes ~3 seconds, while on a CPU it takes ~12 seconds. The cheapest GPU-enabled on-demand instance type on AWS (at the time of this writing) is the p2.xlarge, which costs $0.90/hr (~$650/month) – obviously insanely expensive.

Now, I have a pretty beefy Ubuntu desktop at home with a 20-core CPU and two 1080 Ti GPUs that I use for my deep learning experiments. I've also collected a number of laptops over the years that aren't quite as powerful, but still pack quite a punch. Furthermore, I already have a reserved EC2 instance on AWS Lightsail where my website is hosted. This got me thinking – can I somehow expose my application, which runs as a local Flask web server, to the internet? It turns out that this is indeed possible and not so difficult to do.

There are five components to the full system (Figure 1):

![Figure 1: Components of the full system]()

- The EC2 instance provisioned by AWS Lightsail, where my website is hosted. The website is served by an Apache server listening on the default HTTP(S) ports.
- Remote SSH port forwarding: in this technique, the local host (i.e., the host of the web server that needs to be exposed) initiates an SSH tunnel to the remote host (the computer open to the internet) and assigns a socket to listen on a port on the remote host. When a connection is made to that port, it is forwarded over the SSH tunnel to a port on the local host. Simply put, you open a tunnel between your local machine and a remote server so that the tunnel can be used to connect from the server back to your local machine (a sketch of the command appears after this list). Note that there are two ports involved – one on the remote host that can receive traffic from the internet via a proxy (described in more detail later), and another on the local host. These two ports can be the same number because they are on separate computers.
- A web server running on the local host that directs the traffic received on the local-host side of the SSH tunnel to the application server (an example configuration follows the list). I use an Apache web server because that's what I'm more familiar with; other popular web servers such as nginx can also be used.
- The application server: this is your application code wrapped in a web server such as Flask or Gunicorn (a minimal sketch closes out the examples below).
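To make the remote forwarding step concrete, here is a minimal sketch of the command the local machine would run. The user name, host, and port numbers are placeholders, not the values from my actual setup:

```bash
# Run on the local machine (the one hosting the app).
# -N : don't execute a remote command, just forward ports.
# -R : ask the remote sshd to listen on port 8080 and forward every
#      connection made to it back through the tunnel to port 8080
#      on this machine.
ssh -N -R 8080:localhost:8080 user@example.com
```

By default, sshd binds the remote end of the tunnel to the loopback interface only, which is fine here: it is the proxy on the remote host, not the internet directly, that connects to the tunnel port.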
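The same reverse-proxy pattern is used on both machines: the remote Apache forwards internet traffic to the remote end of the tunnel, and the local Apache forwards tunnel traffic to the application server. Below is a sketch of such a virtual host; the server name, file path, and ports are assumptions for illustration, not my actual configuration:

```apache
# Hypothetical file: /etc/apache2/sites-available/demo.conf
# Needs mod_proxy and mod_proxy_http: sudo a2enmod proxy proxy_http
<VirtualHost *:80>
    # Placeholder subdomain
    ServerName demo.example.com

    # Forward incoming requests to the target port: on the remote
    # host this would be the tunnel port (e.g., 8080); on the local
    # host, the application server's port (e.g., 5000).
    ProxyPreserveHost On
    ProxyPass        / http://127.0.0.1:8080/
    ProxyPassReverse / http://127.0.0.1:8080/
</VirtualHost>
```

On Ubuntu, the site would then be enabled with `sudo a2ensite demo` followed by `sudo systemctl reload apache2`.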
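Finally, a toy stand-in for the application server. This is not the actual search code; the route, response shape, and port 5000 are illustrative only:

```python
# app.py: a toy stand-in for the search application's front end.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/query")
def query():
    q = request.args.get("q", "")
    # The real application would run document retrieval and BERT
    # span retrieval here; this stub just echoes the query back.
    return jsonify({"query": q, "results": []})

if __name__ == "__main__":
    # Bind to the loopback interface only; outside traffic arrives
    # through the Apache proxy and the SSH tunnel, never directly.
    app.run(host="127.0.0.1", port=5000)
```

For anything beyond a quick demo, Flask's built-in server would typically be swapped for Gunicorn, e.g. `gunicorn -b 127.0.0.1:5000 app:app`.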