Python: Getting files, downloading them, and moving them?

I have a list of links that I have gotten from an XML document with Python. I want to get the latest link, get the file that it links to, and move that file to a directory on my computer. How can I do this?

1 Answer

  • 1 decade ago
    Favorite Answer

    You want to look at the urllib module. As for getting the latest link, that depends on how your XML is formatted. Assume the XML looks something like this:

    (we'll call it myxml.xml)

    <?xml version="1.0" ?>
    <links>
    <link>http://newest.com/file.txt</link>
    <link>http://older.com/file.txt</link>
    </links>

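    from xml.dom import minidom
    import urllib.request  # Python 3; in Python 2 the call was urllib.urlretrieve

    xmldoc = minidom.parse("myxml.xml")
    # The first child of the document is the <links> root element
    linknode = xmldoc.childNodes[0]
    # childNodes[0] is a whitespace text node, so the first <link> is at index 1
    urldata = linknode.childNodes[1]
    # urlretrieve needs a full file path as the destination, not just a folder
    urllib.request.urlretrieve(urldata.firstChild.data, "C:\\whereyouwant\\file.txt")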

    OK, so linknode is the first child of xmldoc, which is the links element.

    urldata is the second child of links, which is the first link element. Why the second and not the first? Because minidom keeps the whitespace between tags as a text node, so the first child is the newline '\n'.

    urldata.firstChild.data is the actual string between the opening and closing link tags.
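If counting past the whitespace nodes feels fragile, one alternative is to ask minidom for the link elements by tag name. This is only a sketch, assuming Python 3 and the same XML layout as the example above; XML_TEXT, link_urls, and latest_url are names made up for illustration, and it assumes the newest link is listed first.

```python
from xml.dom import minidom

# Same layout as the example myxml.xml, inlined so the snippet runs on its own.
XML_TEXT = """<?xml version="1.0" ?>
<links>
<link>http://newest.com/file.txt</link>
<link>http://older.com/file.txt</link>
</links>"""


def link_urls(xml_text):
    """Return the text of every <link> element, in document order."""
    xmldoc = minidom.parseString(xml_text)
    # getElementsByTagName returns only element nodes, so the '\n' text nodes are skipped.
    return [node.firstChild.data.strip() for node in xmldoc.getElementsByTagName("link")]


xmldoc = minidom.parseString(XML_TEXT)
linknode = xmldoc.childNodes[0]
# The first child of <links> is the whitespace text node, not a <link>.
first_child_is_text = linknode.childNodes[0].nodeType == linknode.TEXT_NODE

urls = link_urls(XML_TEXT)
latest_url = urls[0]  # assumes the newest link is listed first, as in the example
```

For a file on disk, minidom.parse("myxml.xml") can stand in for parseString. If your XML lists the newest link last, take urls[-1] instead.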

    EDIT: Yahoo may mangle the XML when it displays it. You know how XML works: open tag, close tag. Make sure each URL sits between an opening and a closing link tag,

    i.e. [link]url[/link], with the closing tag after the URL and nothing else inside.
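The question also asks about moving the file once it has been fetched. One possible way, sketched here under the assumption of Python 3, is to download to a temporary file and then move it into the target folder with shutil.move. The helper name download_and_move and the choice to name the saved file after the last path segment of the URL are assumptions for illustration, not something from the original answer.

```python
import os
import shutil
import tempfile
import urllib.request
from urllib.parse import urlparse


def download_and_move(url, dest_dir):
    """Download url to a temporary file, then move it into dest_dir."""
    # Name the saved file after the last path segment of the URL (an assumption).
    filename = os.path.basename(urlparse(url).path) or "download"
    os.makedirs(dest_dir, exist_ok=True)

    # Download to a temporary location first, so a failed download never
    # leaves a partial file in the destination folder.
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)
    try:
        urllib.request.urlretrieve(url, tmp_path)
        dest_path = os.path.join(dest_dir, filename)
        shutil.move(tmp_path, dest_path)
    finally:
        # Clean up the temporary file if the download or move failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return dest_path
```

A call such as download_and_move("http://newest.com/file.txt", "C:\\whereyouwant") would then return the path of the saved file. If you would rather skip the temporary file, passing the final file path straight to urlretrieve also works.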
