
HPR4407: A 're-response' Bash script

Author: Dave Morriss
Published: Tue 24 Jun 2025
Episode Link: https://hackerpublicradio.org/eps/hpr4407/index.html

Introduction

On 2025-06-19 Ken Fallon did a show, number 4404, responding to Kevie's
show 4398, which came out on 2025-06-11.

Kevie was using a Bash pipeline to find the latest episode in an RSS
feed, and download it. He used grep to parse the XML of the feed.

Ken's response was to suggest the use of xmlstarlet to parse the XML
because such a complex structured format as XML cannot reliably be
parsed without a program that "understands" the intricacies of the
format's structure. The same applies to other complex formats such as
HTML, YAML and JSON.
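
As a rough illustration of why pattern matching is fragile here, the sketch below runs one grep pattern against two feed fragments that an XML parser would treat as equivalent. The fragments, the example.org URL and the variable names are invented for this example and are not taken from either show.

```shell
#!/bin/bash
# Two hypothetical feed fragments carrying the same data.
# XML allows attribute values in double quotes or in single quotes.
feed_a='<item><enclosure url="https://example.org/ep1.ogg" type="audio/ogg"/></item>'
feed_b="<item><enclosure url='https://example.org/ep1.ogg' type='audio/ogg'/></item>"

# A pattern written with only the first layout in mind
pattern='url="[^"]*"'

# grep --count prints the number of matching lines; it exits non-zero
# when there are none, hence the '|| true'
matches_a=$(printf '%s\n' "$feed_a" | grep --count "$pattern" || true)
matches_b=$(printf '%s\n' "$feed_b" | grep --count "$pattern" || true)

echo "double-quoted attribute: ${matches_a} matching line(s)"
echo "single-quoted attribute: ${matches_b} matching line(s)"
```

The pattern could be widened to cope with single quotes, but other legal variations (such as whitespace around the equals sign) would each need the same treatment, which is the argument for using a tool that parses the format.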

In his show Ken presented a Bash script which dealt with this problem
and that of the ordering of episodes in the feed. He asked how others
would write such a script, and thus I was motivated to produce this
response to his response!

Alternative script

My script is a remodelling of Ken's, not a completely different
solution. It contains a few alternative ways of doing what Ken did, and
a reordering of the parts of his original. We will examine the changes
in this episode.

Script

#!/bin/bash
# Original (c) CC-0 Ken Fallon 2025
# Modified by Dave Morriss, 2025-06-14 (c) CC-0

podcast="https://tuxjam.otherside.network/feed/podcast/"

# [1]
while read -r item
do
    # [2]
    pubDate="${item%;*}"
    # [3]
    pubDate="$( \date --date="${pubDate}" --universal +%FT%T )"
    # [4]
    url="${item#*;}"
    # [5]
    echo "${pubDate};${url}"
done < <(curl --silent "${podcast}" | \
    xmlstarlet sel --text --template --match 'rss/channel/item' \
    --value-of 'concat(pubDate, ";", enclosure/@url)' --nl - ) | \
    sort --numeric-sort --reverse | \
    head -1 | \
    cut -f2 -d';' | wget --quiet --input-file=-  # [6]

I have placed some comments in the script in the form of '# [1]' and
I'll refer to these as I describe the changes in the following numbered
list.

Note: I checked, and the script will run with the comments, though they
are only there to make it easier to refer to things.

1. The format of the pipeline is different. It starts by defining a
   while loop, but the data which the read command receives comes from
   a process substitution of the form '<(statements)' (see the process
   substitution section of "hpr2045 :: Some other Bash tips"). I have
   arranged the pipeline in this way because it's bad practice to place
   a while in a pipeline, as discussed in the show: "hpr3985 :: Bash
   snippet - be careful when feeding data to loops".

   (I added -r to the read because shellcheck, which I run in the vim
   editor, nagged me!)

2. The lines coming from the process substitution are from running curl
   to collect the feed, then using xmlstarlet to pick out the pubDate
   field of the item, and the url attribute of the enclosure field,
   returning them as two strings separated by a semicolon (';'). This
   is from Ken's original code. Each line is read into the variable
   item, and the first element (before the semicolon) is extracted with
   the Bash expression "${item%;*}".
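
The two points above can be tried out without fetching a feed. The following is a minimal sketch: the sample lines, the example.org URLs and the counter variables are made up for the demonstration, and it assumes Bash's default settings (the lastpipe option is not enabled). It contrasts a loop on the receiving end of a pipe with one fed by process substitution, and uses the same parameter expansions as the script to split a line at the semicolon.

```shell
#!/bin/bash
# Hypothetical sample lines in the "pubDate;url" form described above
sample='Wed, 11 Jun 2025 08:00:00 +0000;https://example.org/ep2.ogg
Wed, 28 May 2025 08:00:00 +0000;https://example.org/ep1.ogg'

# Loop on the receiving end of a pipe: it runs in a subshell, so the
# changes made to 'piped_count' are lost when the loop ends
piped_count=0
printf '%s\n' "$sample" | while read -r item
do
    piped_count=$((piped_count + 1))
done

# Loop fed by process substitution: it runs in the current shell, so
# the variables it sets are still available afterwards
count=0
first_date=''
first_url=''
while read -r item
do
    count=$((count + 1))
    if [[ $count -eq 1 ]]; then
        first_date="${item%;*}"   # strip the shortest match of ';*' from the end
        first_url="${item#*;}"    # strip the shortest match of '*;' from the start
    fi
done < <(printf '%s\n' "$sample")

echo "piped loop counted: ${piped_count}"
echo "process substitution loop counted: ${count}"
echo "date: ${first_date}"
echo "url: ${first_url}"
```

The -r option matters less with this sample data, but without it read would treat any backslash in the input as an escape character rather than passing it through unchanged.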
