The Latest in IT Security

Managing file download size in Bash

05 Aug 2011

I’ve been writing a lot of bash code today as a possible replacement for the current process for maintaining malwareblacklist.com.

The original code was written maybe a year ago by someone who no longer works in my department, and because it is written in C#, I prefer not to touch it. At the time, the volume of URLs that needed to be processed was much lower, so the program worked all right. But today we are getting thousands of malicious URLs every day that need full validation, and the old program just can’t keep up.

Anyway, I’ve picked Bash scripting on Linux, alongside PHP (for DB queries). In this post I just wanted to share some code I put together while looking for a solution to a problem other people may run into.

Say you want to download one or several files (possibly simultaneously), but you want to enforce a maximum file size and discard any file that turns out to be too big while it is still downloading. Obviously, you could wait until a file is done downloading and then discard it if it is too big, but you would be wasting precious time and bandwidth.

The following is a solution I came up with, and is by no means the most elegant or even best option. But it got me past this hurdle.

In this example, I am trying to download a large file (an Ubuntu ISO) with a maximum file size of 10 MB. I use wget to pull down the file in the background, then run a loop that checks the current size of the file every two seconds. If the size exceeds the limit, the script kills wget by its PID (so that we don’t inadvertently kill other instances of wget) and then discards the incomplete file.
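What follows is a minimal sketch of that approach, not necessarily the exact script used for malwareblacklist.com. The function names (`file_size`, `enforce_size_limit`, `download_with_limit`) are made up for illustration, and the sketch assumes Bash and a wget that supports `-q` and `-O`.

```bash
#!/usr/bin/env bash
# Sketch: abort a background download once its output file exceeds a size cap.
# All function names here are hypothetical helpers, not part of wget or Bash.

# Print the current size of a file in bytes (0 if it does not exist yet).
file_size() {
    if [ -f "$1" ]; then
        wc -c < "$1" | tr -d ' '
    else
        echo 0
    fi
}

# enforce_size_limit PID FILE MAX_BYTES [INTERVAL_SECONDS]
# Polls FILE while process PID is alive. If FILE grows past MAX_BYTES,
# kills that one PID, deletes the partial file, and returns 1.
# Returns 0 if the process ends on its own with FILE within the limit.
enforce_size_limit() {
    local pid=$1 file=$2 max_bytes=$3 interval=${4:-2}

    while kill -0 "$pid" 2>/dev/null; do        # kill -0 only tests that the PID exists
        if [ "$(file_size "$file")" -gt "$max_bytes" ]; then
            kill "$pid" 2>/dev/null || true     # stop this download, not every wget
            wait "$pid" 2>/dev/null || true     # reap the killed process
            rm -f -- "$file"                    # discard the incomplete file
            return 1
        fi
        sleep "$interval"
    done

    # The process may have finished between two polls, so check one last time.
    if [ "$(file_size "$file")" -gt "$max_bytes" ]; then
        rm -f -- "$file"
        return 1
    fi
    return 0
}

# download_with_limit URL FILE MAX_BYTES
# Starts wget in the background and hands its PID ($!) to the watcher.
download_with_limit() {
    local url=$1 file=$2 max_bytes=$3
    wget -q -O "$file" "$url" &
    local pid=$!
    if ! enforce_size_limit "$pid" "$file" "$max_bytes"; then
        echo "Discarded $file: larger than $max_bytes bytes" >&2
        return 1
    fi
    wait "$pid"    # return wget's own exit status
}

# Example call (placeholder URL, 10 MB limit):
#   download_with_limit "http://example.com/ubuntu.iso" ubuntu.iso $((10 * 1024 * 1024))
```

Because each call tracks only the PID of the wget it started, several downloads can run side by side by launching `download_with_limit ... &` once per URL and finishing with `wait`. The polling interval is a trade-off: a file can overshoot the limit by whatever arrives between two checks.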

If you have a better and quicker solution, I’m interested ;-)

Jerome Segura
