DevHeads.net

script to make webpage snapshot

Dear Experts,

Could someone recommend a script or utility that can be run from the command line
on a Linux or UNIX machine to take a snapshot of a webpage?

We have a signage system (Xibo), and whoever creates or changes content likes to
add URLs of webpages to it. All works well if these are webpages on our own
servers (which are pretty fast), but some external servers often take time
to respond and to assemble the page. In addition, these servers sometimes
get really busy, and when the response takes longer than the time allotted
to that content in the signage window, the window hangs forever with a blank
white field until you restart the client. A trivial workaround: take a
snapshot (as, say, a daily cron job) and point the signage client to that
snapshot. That will definitely solve it, and at the same time we will stop
hitting other people's servers more often than we need to.

But when I tried to search for some utility or script that makes webpage
snapshot, I discovered that my ability to search degraded somehow...

Thanks for all your pointers!

Valeri
++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++
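
The cron workaround described in the post above could be sketched roughly as follows. This is only an illustration, not a tested deployment: the function name `refresh_snapshot` and the `build_command` callback are hypothetical, and the actual capture tool is deliberately left as a parameter. The idea is to let the capture run under a timeout and swap the new image into place only if it succeeded, so a slow or busy site leaves the previous snapshot on screen.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def refresh_snapshot(build_command, target, timeout=60):
    """Run a capture command and swap its output into place only on success.

    build_command(tmp_path) is a hypothetical callback that returns the argv
    list of a program writing the image to tmp_path.
    Returns True if target was updated, False if the old file was kept.
    """
    target = Path(target)
    # Create the temp file next to the target so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(suffix=target.suffix, dir=target.parent)
    os.close(fd)
    tmp_path = Path(tmp_name)
    try:
        subprocess.run(build_command(tmp_path), check=True, timeout=timeout)
        if tmp_path.stat().st_size == 0:
            return False  # the tool wrote nothing; keep the old snapshot
        os.chmod(tmp_path, 0o644)  # mkstemp creates the file as 0600
        os.replace(tmp_path, target)  # atomic rename on POSIX filesystems
        return True
    except (subprocess.SubprocessError, OSError):
        return False  # timeout, non-zero exit, or missing tool
    finally:
        tmp_path.unlink(missing_ok=True)  # clean up if the swap did not happen
```

A script built around this could be run from a daily crontab entry, with the signage client pointed at the target file rather than at the external URL.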

Comments

Re: script to make webpage snapshot

By Anthony at 08/12/2016 - 05:55

On 12/08/16 06:46, Valeri Galtsev wrote:
CmdShots is a Firefox add-on that takes full-page screenshots from
the command line.

[0] https://github.com/omarabid/CmdShots

Re: script to make webpage snapshot

By Anthony at 08/12/2016 - 05:59

On 12/08/16 19:55, Anthony K wrote:
For completeness' sake: the author first sought a solution on Stack
Overflow [1]. When none was forthcoming, he created his own.

[1] http://stackoverflow.com/questions/13158083/take-a-full-page-screenshot-with-firefox

Re: script to make webpage snapshot

By John R Pierce at 08/11/2016 - 18:02

On 8/11/2016 1:46 PM, Valeri Galtsev wrote:
Many/most webpages these days are heavily dynamic, so a static
snapshot would likely break. Plus, any site-relative links in that
snapshot would point to your server, not the original, and any Ajax
code on the page would try to interact with your server, which won't
be running the right back-end stuff, etc.
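
The point about site-relative links can be illustrated with a small sketch. Both addresses below are hypothetical; the example only shows how a browser would resolve the same relative links depending on where the saved HTML is served from.

```python
from urllib.parse import urljoin

# Hypothetical addresses, for illustration only.
original_page = "https://news.example.org/weather/today.html"
local_copy = "https://signage.example.edu/snapshots/today.html"

# One site-relative link and one document-relative link.
for link in ("/css/site.css", "img/radar.png"):
    print(link)
    print("  on the original site:", urljoin(original_page, link))
    print("  on the local copy:   ", urljoin(local_copy, link))
```

In both cases the link from the local copy resolves against the local server, which would not have those files unless they were mirrored as well.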

Re: script to make webpage snapshot

By Valeri Galtsev at 08/11/2016 - 18:10

On Thu, August 11, 2016 5:02 pm, John R Pierce wrote:
I usually am not good at explaining what I need. I really only need an
image of what one would see in a web browser if one pointed it to that URL.
I do not care about it being interactive. I also don't want to fetch
("mirror") the content that URL points to at a variety of "depths", which is
why I don't want to use wget or curl. That is what I tried first, and
it breaks with at least one of the websites; they do seem to protect
themselves from "robots" or similar. And we don't need it. We just need to
show what the page shows today, that's all.

Valeri

Re: script to make webpage snapshot

By John R Pierce at 08/11/2016 - 18:27

On 8/11/2016 3:10 PM, Valeri Galtsev wrote:
Then screen capture is about it... On too many sites, ALL the content is
dynamic. For instance,
https://www.google.com/maps/@36.9460899,-122.0268105,664a,20y,41.31t/data=!3m1!1e3

That page is composed of tiles of image data superimposed on the fly,
with Ajax code running in the browser to fetch the layers displayed.

You simply can't fetch the HTML and make any sense out of it; the
browser is running a complex application to display that.
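
Since a real browser has to render such a page, one way to script a screen capture is to drive a browser in headless mode. The sketch below only builds the command line and does not run it. It assumes a Firefox or Chromium build recent enough to support headless screenshots; the function name `screenshot_command` is hypothetical, the flags are given as I understand them for recent releases, and they should be checked against the installed version.

```python
def screenshot_command(url, out_png, browser="firefox", size=(1920, 1080)):
    """Build the argv list for a headless browser screenshot (not executed here)."""
    width, height = size
    window = f"--window-size={width},{height}"
    if browser == "firefox":
        return ["firefox", "--headless", window, "--screenshot", str(out_png), url]
    if browser == "chromium":
        # Assumption: the binary may be installed as chromium-browser or google-chrome.
        return ["chromium", "--headless", window, f"--screenshot={out_png}", url]
    raise ValueError(f"unsupported browser: {browser}")


# Hypothetical URL and output path, for illustration only.
print(" ".join(screenshot_command("https://www.example.org/", "/tmp/page.png")))
```

The resulting list can be passed to `subprocess.run(..., check=True, timeout=...)`, so that a site that never finishes loading does not block the job indefinitely.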

Re: script to make webpage snapshot

By Valeri Galtsev at 08/11/2016 - 18:32

On Thu, August 11, 2016 5:27 pm, John R Pierce wrote:
Yes, I understand as much, thanks. I'm still sure it is not a hopeless task.

Valeri

Re: script to make webpage snapshot

By Dave Stevens at 08/11/2016 - 18:13

Quoting Valeri Galtsev < ... at kicp dot uchicago.edu>:

why not File -> Print -> .pdf?

D

Re: script to make webpage snapshot

By Liam O'Toole at 08/11/2016 - 17:13

On 2016-08-11, Valeri Galtsev
< ... at kicp dot uchicago.edu> wrote:
Not an answer to the question you asked, but maybe this is a job for a
caching proxy server like squid?

Re: script to make webpage snapshot

By Valeri Galtsev at 08/11/2016 - 18:13

On Thu, August 11, 2016 4:13 pm, Liam O'Toole wrote:
Thanks! It didn't occur to me. It will be much more sophisticated than
just an image "snapshot" of the webpage, but it should solve our problem. If
I don't find anything that takes a "snapshot" successfully, this is what I
will do.

Valeri

Re: script to make webpage snapshot

By Frank Cox at 08/11/2016 - 16:53

On Thu, 11 Aug 2016 15:46:42 -0500 (CDT)

wget?
httrack?