Changeset 1066


Timestamp:
Aug 1, 2017, 4:30:24 PM
Author:
iritscen
Message:

Updating Val to new location of files on oni2.net and slight changes to Archive API.

Location:
Validate External Links
Files:
3 edited

  • Validate External Links/Run validate_external_links.command

    r1064 r1066  
    @@ -12,5 +12,5 @@
     LINKS_ONLINE="http://wiki.oni2.net/w/extlinks.csv"
     EXCEPT_LOCAL="file:///path/to/Validate External Links/exceptions.txt"
    -EXCEPT_ONLINE="http://iritscen.oni2.net/wiki/exceptions.txt"
    +EXCEPT_ONLINE="http://iritscen.oni2.net/val/exceptions.txt"
     REPORT_DIR="/path/to/ValExtLinks reports"
     UPLOAD_INFO="/path/to/Validate External Links/sftp_login.txt"
  • Validate External Links/sftp_login.txt

    r1064 r1066  
    @@ -2,3 +2,3 @@
     pw:
     port:52010
    -path:http/wiki
    +path:http/val
  • Validate External Links/validate_external_links.sh

    r1064 r1066  
    @@ -26,5 +26,5 @@
     
     # Fixed strings -- see the occurrences of these variables to learn their purpose
    -AGENT="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3146.0 Safari/537.36"
    +AGENT="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:54.0) Gecko/20100101 Firefox/54.0"
     ARCHIVE_API="http://archive.org/wayback/available"
     ARCHIVE_GENERIC="https://web.archive.org/web/*"
    @@ -32,7 +32,7 @@
     CHROME="/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary"
     CHROME_SCREENSHOT="screenshot.png"
    -CURL_CODES="http://iritscen.oni2.net/wiki/curl_codes.txt"
    +CURL_CODES="http://iritscen.oni2.net/val/curl_codes.txt"
     EXPECT_SCRIPT_NAME="val_expect_sftp.txt"
    -HTTP_CODES="http://iritscen.oni2.net/wiki/http_codes.txt"
    +HTTP_CODES="http://iritscen.oni2.net/val/http_codes.txt"
     MY_WIKI_PAGE="http://wiki.oni2.net/User:Iritscen"
     THIS_DIR=$(cd $(dirname $0); pwd)
    @@ -51,6 +51,6 @@
     # These arrays tells us which HTTP response codes are OK (good) and which are NG (no good). Pages that
     # return NG codes will not be screenshotted. Remember to update http_codes.txt if you add a new code.
    -declare -a OK_CODES=(200 301 302 307 401 405 406 501)
    -declare -a NG_CODES=(000 403 404 410 500 503)
    +declare -a OK_CODES=(200 301 307 401 405 406 501)
    +declare -a NG_CODES=(000 302 403 404 410 500 503)
     
     # Characters not allowed in a URL. Curly braces are sometimes used on the wiki to build a link using
    @@ -746,8 +746,8 @@
              ARCHIVE_QUERY=$(curl --silent --max-time 10 "$ARCHIVE_API?url=$URL&$ARCHIVE_OK_CODES")
     
    -         # Isolate "url" property in response and log it if received...
    -         if [[ "$ARCHIVE_QUERY" == *\"url\":* ]]; then
    -            SNAPSHOT_URL=${ARCHIVE_QUERY#*\"url\":\"}
    -            SNAPSHOT_URL=${SNAPSHOT_URL%\",\"timestamp*}
    +         # Isolate "url" property in response and log it if a "closest" snapshot was received...
    +         if [[ "$ARCHIVE_QUERY" == *\"closest\":* ]]; then
    +            SNAPSHOT_URL=${ARCHIVE_QUERY##*\"url\": \"}
    +            SNAPSHOT_URL=${SNAPSHOT_URL%\", \"timestamp*}
                 valPrint t "  IA suggests $SNAPSHOT_URL"
                 valPrint r "                IA suggests {\field{\*\fldinst{HYPERLINK \"$SNAPSHOT_URL\"}}{\fldrslt $SNAPSHOT_URL}}"
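
The Archive API change in the last hunk can be tried in isolation with the sketch below. It uses the same two parameter expansions as r1066 against a hand-written sample response. The sample JSON, its field order, and the function name `extract_snapshot_url` are assumptions made for illustration and are not taken from validate_external_links.sh or from a live API reply.

```shell
#!/bin/bash
# Hypothetical sample response. The field order and spacing are assumed here;
# a real reply from the Archive API may be laid out differently.
ARCHIVE_QUERY='{"url": "example.com/page", "archived_snapshots": {"closest": {"available": true, "url": "http://web.archive.org/web/20170801000000/http://example.com/page", "timestamp": "20170801000000", "status": "200"}}}'

# Hypothetical helper: prints the snapshot URL, or returns 1 if the response
# contains no "closest" snapshot.
extract_snapshot_url() {
   local response="$1"

   # Testing for "url" alone is not enough when the response also carries a
   # top-level "url" property that merely repeats the query, so look for
   # "closest" instead.
   if [[ "$response" != *\"closest\":* ]]; then
      return 1
   fi

   # '##' strips the longest matching prefix, i.e. everything up to and
   # including the LAST occurrence of '"url": "'.
   local snapshot=${response##*\"url\": \"}

   # '%' strips the shortest matching suffix, starting at '", "timestamp'.
   snapshot=${snapshot%\", \"timestamp*}

   printf '%s\n' "$snapshot"
}

if SNAPSHOT_URL=$(extract_snapshot_url "$ARCHIVE_QUERY"); then
   echo "  IA suggests $SNAPSHOT_URL"
else
   echo "  IA has no snapshot"
fi
```

This string matching depends on the exact spacing and key order of the response, which is presumably why the patterns had to be updated in this changeset. A JSON-aware parser would be less brittle, but it would add a dependency the script does not currently have.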