There are numerous issues with this code.
'(use [net.cgrand.enlive-html])
This does not bring in a library. The leading quote turns the whole form into a literal list, and nothing is done with that list:
user> (class '(use [net.cgrand.enlive-html]))
clojure.lang.PersistentList
The quoted form is never evaluated, so it is effectively a no-op.
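A minimal sketch of the difference, using clojure.set in place of enlive so it runs without extra dependencies (quoted-form is a hypothetical name used only for illustration):

```clojure
;; Quoting the whole form produces data; no namespace is loaded.
(def quoted-form '(require [clojure.set :as set]))

(list? quoted-form)   ; true, it is just a list
(first quoted-form)   ; the symbol require, not a call to it

;; Quoting only the argument lets the call itself be evaluated.
(require '[clojure.set :as set])

(set/union #{1} #{2}) ; works now that the namespace is loaded and aliased
```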
(def ^:dynamic *ref-modifier* `(remove :content))
This creates a two-element list, not a "modifier" of any sort: the backtick quotes the form (remove :content) instead of calling remove.
(defmacro extract-re [node re modifier]
  `(doseq [seqs (map :content (node))]
     (re-find re (apply str (modifier seqs)))))
Here you use syntax-quote, but you never unquote anything inside it, so node, re, and modifier in the template are ordinary symbols rather than the macro's arguments. The macro doesn't use any of its arguments in any way.
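The effect of a missing unquote is easiest to see on a tiny macro. This is only a sketch; broken-twice and twice are hypothetical names, not part of the code under discussion:

```clojure
;; No unquote: x in the template is just a symbol, so the argument is ignored.
(defmacro broken-twice [x]
  `(* 2 x))

;; With ~x the argument's form is spliced into the template.
(defmacro twice [x]
  `(* 2 ~x))

;; The last element is a namespace-qualified symbol named x
;; (user/x at a default REPL), not the number 21.
(macroexpand-1 '(broken-twice 21))

(macroexpand-1 '(twice 21)) ; (clojure.core/* 2 21)
(twice 21)                  ; 42
```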
You seem to want to apply modifier as if it were a function (this does not even begin to happen; see the quoting issues above), but as the actual call shows, modifier is a two-element list, and calling it would throw an error.
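To make the list-versus-function distinction concrete, here is a small sketch. The names as-list, as-fn and call-result are hypothetical, and the sample data only imitates the shape of parsed HTML children (strings for text, maps for nested elements):

```clojure
;; Syntax-quoted: a two-element sequence holding a symbol and a keyword.
(def as-list `(remove :content))

;; partial builds an actual function that still awaits its collection.
(def as-fn (partial remove :content))

(count as-list) ; 2
(first as-list) ; the symbol clojure.core/remove
(ifn? as-list)  ; false, it cannot be invoked
(fn? as-fn)     ; true

;; Invoking the sequence throws ClassCastException.
(def call-result
  (try (as-list [])
       (catch ClassCastException _ :not-callable)))

;; Maps with a truthy :content are removed; plain strings stay.
(as-fn ["plain text" {:tag :a :content ["link"]}]) ; ("plain text")
```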
Finally, doseq is only useful for side effects and always returns nil. The doseq body does not use the value returned by re-find, so the body is effectively a no-op.
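A short sketch of the difference between doseq and the value-producing alternatives (the var names are made up for the example):

```clojure
;; doseq evaluates its body for side effects and discards each result.
(def via-doseq (doseq [n [1 2 3]] (* n n)))       ; nil

;; for builds a lazy sequence of the body's values; doall realizes it.
(def via-for (doall (for [n [1 2 3]] (* n n))))   ; (1 4 9)

;; mapv is an eager alternative that returns a vector.
(def via-mapv (mapv #(* % %) [1 2 3]))            ; [1 4 9]
```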
Additionally, I see dubious utility in using dynamic var declarations for vars that will be supplied as explicit function arguments.
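As a rough illustration of that point: a dynamic var pays off when a function reads it implicitly and callers rebind it with binding, whereas a value passed as an argument needs no var at all. The names below are hypothetical:

```clojure
(def ^:dynamic *separator* ", ")

;; Reads the dynamic var implicitly; callers change it with binding.
(defn join-implicit [xs]
  (apply str (interpose *separator* xs)))

;; Takes the separator explicitly; no dynamic var is involved.
(defn join-explicit [separator xs]
  (apply str (interpose separator xs)))

(join-implicit ["a" "b"])                            ; "a, b"
(binding [*separator* " | "] (join-implicit ["a" "b"])) ; "a | b"
(join-explicit " | " ["a" "b"])                      ; "a | b"
```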
With all of these issues addressed, I think we are closer to something that works:
(use 'net.cgrand.enlive-html)

(def ^:dynamic *base-url*
  "https://www.impacttest.com/research/?Clinical-Research-Database-4")

(def ^:dynamic *ref-selector* [:div#content_1 :ul :li])

(defn fetch-url [url]
  (html-resource (java.net.URL. url)))

(defn references []
  (select (fetch-url *base-url*) *ref-selector*))

(def ^:dynamic *ref-regex* #"\s([A-Z]{1}[\w|\s]+)[,|\.]")

;; A real function: drops child nodes that have non-nil :content
;; (nested elements); plain strings stay.
(def ^:dynamic *ref-modifier* (partial remove :content))

;; node is a zero-argument function (such as references) that returns the nodes.
(defn extract-re [node re modifier]
  (doall
   (for [sq (map :content (node))]
     (re-find re (apply str (modifier sq))))))
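Before running this against the live page, the shape of a single re-find result can be checked offline. The sketch below reuses the regex from the code above; ref-regex, sample-text, match and first-author are hypothetical names, and the sample string is made up to resemble a reference entry:

```clojure
(def ref-regex #"\s([A-Z]{1}[\w|\s]+)[,|\.]")

(def sample-text " Schatz P, Moser RS. A made-up title.")

;; With one capture group, re-find returns [whole-match group-1].
(def match (re-find ref-regex sample-text)) ; [" Schatz P," "Schatz P"]

;; second picks out the captured name; it is nil when nothing matches.
(defn first-author [text]
  (second (re-find ref-regex text)))

(first-author sample-text)      ; "Schatz P"
(first-author "no author here") ; nil
```

Mapping second over the result of extract-re would give just the captured names instead of the match vectors.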
and in action:
user> (extract-re references *ref-regex* *ref-modifier*)
([" Dambinova SA," "Dambinova SA"] [" Zuckerman SL," "Zuckerman SL"] [" Conklin HM," "Conklin HM"] [" Covassin T," "Covassin T"] [" Maerlender A," "Maerlender A"] [" Fedor A," "Fedor A"] [" Resch J," "Resch J"] [" Elbin RJ," "Elbin RJ"] [" Rabinowitz AR," "Rabinowitz AR"] [" Kinnaman KA," "Kinnaman KA"] [" Tsushima WT," "Tsushima WT"] [" Amonette WE," "Amonette WE"] [" Lovell MR," "Lovell MR"] [" Schatz P," "Schatz P"] [" McGrath N," "McGrath N"] [" Kontos AP," "Kontos AP"] [" AB," "AB"] [" Meehan WP," "Meehan WP"] [" Rieger BP," "Rieger BP"] [" Solomon GS," "Solomon GS"] [" Sandel NK," "Sandel NK"] [" Schatz P," "Schatz P"] [" Schatz P," "Schatz P"] [" Lebrun CM," "Lebrun CM"] [" Brooks B," "Brooks B"] [" Meehan WP," "Meehan WP"] [" Fakhran S," "Fakhran S"] [" Cole WR," "Cole WR"] [" Tsushima M," "Tsushima M"] [" Zuckerman SL," "Zuckerman SL"] [" JK," "JK"] [" Covassin T," "Covassin T"] [" Moser RS," "Moser RS"] [" Mayers LB," "Mayers LB"] [" McAllister TW," "McAllister TW"] [" Meehan WP 3rd," "Meehan WP 3rd"] [" Neal MT," "Neal MT"] [" Lau BC," "Lau BC"] [" Kontos AP," "Kontos AP"] [" Gardner A," "Gardner A"] [" Elbin RJ," "Elbin RJ"] [" Wolf EG," "Wolf EG"] [" Reddy CC," "Reddy CC"] [" Moser RS," "Moser RS"] [" Guerriero RM," "Guerriero RM"] [" Deibert E," "Deibert E"] [" Wiebe DJ," "Wiebe DJ"] [" Baillargeon A," "Baillargeon A"] [" Erdal K." "Erdal K"] [" Maugans TA," "Maugans TA"] [" Iverson GL," "Iverson GL"] [" Ponsford J," "Ponsford J"] [" Schatz P," "Schatz P"] [" Mulligan I," "Mulligan I"] [" Echlin PS," "Echlin PS"] [" McLeod TC," "McLeod TC"] [" Zuckerman SL," "Zuckerman SL"] [" Kontos AP," "Kontos AP"] [" Zuckerman SL," "Zuckerman SL"] [" Schatz P," "Schatz P"] [" Kontos AP," "Kontos AP"] [" Covassin T," "Covassin T"] [" Covassin T," "Covassin T"] [" Duhaime AC," "Duhaime AC"] [" Echemendia RJ," "Echemendia RJ"] [" Ramanathan DM," "Ramanathan DM"] [" Meehan WP 3rd," "Meehan WP 3rd"] [" Krol AL," "Krol AL"] [" Turgeon C," "Turgeon C"] [" Randolph C." 
"Randolph C"] [" Barlow M," "Barlow M"] [" Schatz P," "Schatz P"] [" Moser RS," "Moser RS"] [" Broglio SP," "Broglio SP"] [" Thomas DG," "Thomas DG"] [" Allen BJ," "Allen BJ"] [" Solomon GS," "Solomon GS"] [" Ponsford J," "Ponsford J"] [" Johnson EW," "Johnson EW"] [" Randolph C," "Randolph C"] [" Elbin RJ," "Elbin RJ"] [" Broglio SP," "Broglio SP"] [" Kontos AP," "Kontos AP"] [" Lau BC," "Lau BC"] [" Lau BC," "Lau BC"] [" Hettich T," "Hettich T"] [" Elbin T," "Elbin T"] [" Maerlender A," "Maerlender A"] [" Kontos AP," "Kontos AP"] [" Talavage TM," "Talavage TM"] [" Meehan WP 3rd," "Meehan WP 3rd"] [" Lange RT," "Lange RT"] [" Covassin T," "Covassin T"] [" Schatz P." "Schatz P"] [" Lange RT," "Lange RT"] [" Pardini JE," "Pardini JE"] [" Echlin PS," "Echlin PS"] [" Schatz P," "Schatz P"] [" Echlin PS," "Echlin PS"] [" Keightley ML," "Keightley ML"] [" McGrath N." "McGrath N"] [" Covassin T," "Covassin T"] [" Pontifex MB," "Pontifex MB"] [" AB," "AB"] [" Casson IR," "Casson IR"] [" McCrory P," "McCrory P"] [" Covassin T," "Covassin T"] [" Bruce JM," "Bruce JM"] [" Covassin T," "Covassin T"] [" Lovell M." "Lovell M"] [" Lau B," "Lau B"] [" Nance ML," "Nance ML"] [" Peterson SE," "Peterson SE"] [" Lovell M." "Lovell M"] [" Broglio SP," "Broglio SP"] [" Broglio SP," "Broglio SP"] [" Colvin AC," "Colvin AC"] [" Reddy CC," "Reddy CC"] [" Solomon GS," "Solomon GS"] [" Covassin T," "Covassin T"] [" Majerske CW," "Majerske CW"] [" Lovell MR," "Lovell MR"] [" AB," "AB"] [" Tsushima WT," "Tsushima WT"] [" Miller JR," "Miller JR"] [" Slobounov S," "Slobounov S"] [" Mihalik JP," "Mihalik JP"] [" Covassin T," "Covassin T"] [" Lovell MR," "Lovell MR"] [" Stoller KP." "Stoller KP"] [" Broglio SP," "Broglio SP"] [" Moser RS," "Moser RS"] [" Iverson G." 
"Iverson G"] [" Fazio VC," "Fazio VC"] [" Swanik CB," "Swanik CB"] [" Broglio SP," "Broglio SP"] [" Covassin T," "Covassin T"] [" Broglio SP," "Broglio SP"] [" Chen JK," "Chen JK"] [" Van Kampen DA," "Van Kampen DA"] [" Broglio SP," "Broglio SP"] [" Pellman EJ," "Pellman EJ"] [" Pellman EJ," "Pellman EJ"] [" Schatz P," "Schatz P"] [" Biasca N," "Biasca N"] [" Collins M," "Collins M"] [" Lovell MR," "Lovell MR"] [" Lovell MR," "Lovell MR"] [" Iverson GL," "Iverson GL"] [" Cantu RC," "Cantu RC"] [" McClincy MP," "McClincy MP"] [" Schatz P," "Schatz P"] [" Iverson GL," "Iverson GL"] [" Van Kampen DA," "Van Kampen DA"] [" Lovell M," "Lovell M"] [" Mihalik JP," "Mihalik JP"] [" Moser RS," "Moser RS"] [" Broshek DK," "Broshek DK"] [" Grove R," "Grove R"] [" McCrea M," "McCrea M"] [" McCrory P," "McCrory P"] [" Iverson GL," "Iverson GL"] [" Lovell MR," "Lovell MR"] [" Bruce JM," "Bruce JM"] [" Pellman EJ," "Pellman EJ"] [" Iverson GL," "Iverson GL"] [" Lovell MR," "Lovell MR"] [" Kontos A," "Kontos A"] [" Collins MW," "Collins MW"] [" Iverson GL," "Iverson GL"] [" Lovell M," "Lovell M"] [" Field M," "Field M"] [" Covassin T," "Covassin T"] [" Iverson GL," "Iverson GL"] [" Lovell MR," "Lovell MR"] [" Collins MW," "Collins MW"] [" Lovell MR," "Lovell MR"] [" Collins MW," "Collins MW"] [" Collins MW," "Collins MW"] [" Collins MW," "Collins MW"] [" Maroon JC," "Maroon JC"] [" Lovell MR," "Lovell MR"] [" Lovell MR." "Lovell MR"] [" Aubry M," "Aubry M"] [" Grindel SH," "Grindel SH"] [" Collins MW," "Collins MW"] [" Lovell MR," "Lovell MR"] [" Collins MW," "Collins MW"] [" Lovell MR," "Lovell MR"])