1

I'm working on a DSpace 5.10 repository with PostgrSQL 9.x. The problem is that when harvested, there are a lot of items that lack metadata required by the regulating entity of my country. Is there a way to bring up which itemID's don't have a specific field?

For example:

I need a query that gives me as result all the resource_id that don't have a metadatafield_id = X. A same resource_id has many metadata_field_id entries.

Thanks a lot.

  • 1
    Please review [ask] Using that as a template for your question greatly increases your chance of getting a satisfactory answer. In particular post your table DDL, sample test data as [formatted text](https://www.worldometers.info/) - **no images**, and the expected results from that data. Show what you have tried and be specific about what issue you have. – Belayer Apr 17 '20 at 18:59
  • Unrelated, but: "9.x" covers **7 major versions** many of them [no longer supported](https://www.postgresql.org/support/versioning/). Although probably not important for this question, you should tell us your exact version nevertheless. –  Apr 20 '20 at 09:11

2 Answers2

0

If I'm understanding you properly:

  1. You're looking to return all resource_id that don't have X in the field metadatafield_id.
  2. There are multiple rows per resource_id, but only some of those rows don't contain X in their respective metadata_field_id columns.

If so, try this:

SELECT distinct resource_id
FROM your_table_name
WHERE metadata_field_id != 'X'

By using distinct, you remove all duplicate rows. In this way, you'll only return unique resource_id. Without using distinct, you will return duplicate entries for resource_id in your result.

Here is the PostgreSQL documentation for distinct.

EDIT: distinct is only supported on PostgreSQL versions 9.5+

  • `distinct` has been supported by all Postgres versions since [at least 7.1](https://www.postgresql.org/docs/7.1/sql-select.html#R1-SQL-SELECT-1) –  Apr 20 '20 at 09:09
0

You need to get the list of all items that doesn't have a specific metadata, so the easier way is to exclude from the complete list the ones that actually have such metadata

select item_id from item where item_id not in 
(
   select resource_id from resourcepolicy where 
      resource_type_id = 2 and metadata_field_id = ?
);
abollini
  • 66
  • 2