Get File Length over HTTP before you download it

August 28, 2010

This is one of those forum questions that you “think” you know the answer to, and then you’re proven wrong.
A user wants to download a file from a remote site, but does not want to proceed with the download if the file is larger than 10MB. Makes sense, right?
I said there was no way to do this without downloading the file. I was wrong. Here’s how he solved his own problem:

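using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        string completeUrl =
            "http://www.eggheadcafe.com/FileUpload/1145921998_ObjectDumper.zip";
        // OpenRead issues a GET; the response headers are available once it returns.
        using (WebClient client = new WebClient())
        using (Stream s = client.OpenRead(completeUrl))
        {
            // Content-Length can be missing (e.g. chunked responses); this then prints an empty line.
            Console.WriteLine(client.ResponseHeaders["Content-Length"]);
        }
        Console.ReadLine();
    }
}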

The above correctly reports the file size of 85,827 bytes without ever downloading the file!

Somebody had a problem. Instead of giving up (or worse, taking my so-called “expert advice”) he thought “outside the box” and found a solution. I call that outstanding!

NOTE: At least one commenter pointed out that a HEAD request is the most efficient approach. That's true, but in my experience not every server handles HEAD requests correctly, so you may need to make a second request (a GET) if the HEAD fails.
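
Here is a rough sketch of that HEAD-first idea, falling back to a GET when the HEAD request throws. The names `RemoteFileSize`, `GetContentLength`, `Query`, and `IsWithinLimit` are made up for illustration and are not from the forum post. Treating an unknown size as "not within the limit" is my own assumption, not a rule. It uses `HttpWebRequest` to match the era of the code above; newer .NET code would normally use `HttpClient` instead.

```csharp
using System;
using System.Net;

static class RemoteFileSize
{
    // Hypothetical helper: true only when the size is known and within the limit.
    public static bool IsWithinLimit(long contentLength, long maxBytes)
    {
        return contentLength >= 0 && contentLength <= maxBytes;
    }

    // Returns the Content-Length reported for the URL, or -1 if the server did not send one.
    public static long GetContentLength(string url)
    {
        try
        {
            return Query(url, "HEAD");
        }
        catch (WebException)
        {
            // Assumed fallback for servers that reject HEAD: a GET whose body is never read.
            return Query(url, "GET");
        }
    }

    static long Query(string url, string method)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = method;
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            // ContentLength is -1 when the Content-Length header is absent.
            return response.ContentLength;
        }
    }
}
```

Passing the result of `RemoteFileSize.GetContentLength(completeUrl)` to `IsWithinLimit` with a limit of `10L * 1024 * 1024` would give the 10MB check from the original question. Note that the GET fallback can still pull the first part of the file across the wire before the response is disposed.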

Comments

Jef Claes 5:24 PM
That's a neat trick!

Huseyin Tufekcilerli 5:47 PM
Alternatively, you can make a HEAD request to that web resource, see:

http://forrst.com/posts/HEAD_requst_to_get_ContentLength-yVh

Anonymous 8:37 PM
Peter, unfortunately what you (and the forum poster) are stating is only partially true. Although the entire file is not downloaded, part of the file is downloaded.

As soon as the server responds to your GET request, it sends the HTTP headers along with the beginning of the file (whatever can fit in the first few packets). Once .NET processes the first few packets of data, the code above forcibly terminates the connection with the server.

However, if you look at the Wireshark capture below, you can see that some of the zip file was sent across the wire. The commenter who posted about using HEAD is describing the correct mechanism for doing what the poster asked.

Hope that helps,
John

GET /FileUpload/1145921998_ObjectDumper.zip HTTP/1.1
Host: www.eggheadcafe.com
Connection: Keep-Alive

HTTP/1.1 200 OK
Content-Type: application/x-zip-compressed
Last-Modified: Sat, 28 Aug 2010 20:00:56 GMT
Accept-Ranges: bytes
ETag: "ad35ebaeb46cb1:0"
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET
Date: Mon, 30 Aug 2010 01:23:29 GMT
Content-Length: 85827

PK...........=,.O.....p.......ObjectDumper.sln...n.@.......^..%.....v.m..0.....j.A.....>Y.}..BwS...*..3.g......O]...Tdb...<+....E....I.s13(OX..".FR..L%m..P.....h.l/./l.....C........v...". .

(I truncated the Wireshark capture here)

Zubair.NET! 11:58 PM
Does this mean loading the entire file into memory to get the size? That looks so inefficient to me.

peterbromberg 5:09 PM
@Zubair.NET!
As Huseyin and John both pointed out, making a HEAD request is the best solution.