I have been exploring the Software Heritage Graph Dataset recently; the data I need is huge, up to 30+ GB. To make the analysis smoother, I started learning PostgreSQL to help store, manipulate, and retrieve the data. This blog post is a summary of the commands I learn and use most frequently.

connect to a database: psql DBNAME from the shell, or \c DBNAME inside psql
quit database: \q
get help: help or \? (press q to leave the pager) inside psql, or psql --help from the shell
list of databases: \l
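
Putting the basics together, a short session on a hypothetical database named swh_graph (the name is only for illustration):

connect from the shell: psql swh_graph
list the available databases: \l
quit: \q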


create database: CREATE DATABASE NAME;
delete database: DROP DATABASE NAME;
describe tables: \d (all relations), \d table_name (one specific table), or \dt (tables only)
create table: CREATE TABLE table_name (
    column_name data_type [constraints],
    ...
);
drop table: DROP TABLE NAME;
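
As a concrete sketch (the database, table, and column names below are hypothetical, chosen only for illustration), creating a database and a table could look like:

CREATE DATABASE swh_analysis;   -- hypothetical database name
\c swh_analysis
CREATE TABLE revisions (        -- hypothetical table for illustration
    id SERIAL PRIMARY KEY,
    author TEXT NOT NULL,
    committed_at TIMESTAMP
);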

insert records into a table: INSERT INTO NAME (
    column_name, ...
)
)
VALUES (value1, ...);
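
Continuing with the hypothetical revisions table from the CREATE TABLE sketch above:

INSERT INTO revisions (author, committed_at)
VALUES ('alice', '2019-06-01 12:00:00');   -- the values are made up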

select all: SELECT * FROM NAME;
order by: SELECT * FROM NAME ORDER BY column_name ASC/DESC;
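
For example, selecting from the hypothetical revisions table with the newest commits first:

SELECT * FROM revisions ORDER BY committed_at DESC;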

copy a query result set to a file: COPY ([Query]) TO '[File Name]' DELIMITER ',' CSV HEADER;
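
For example, exporting part of the hypothetical revisions table to a CSV file (the output path is made up):

COPY (SELECT author, committed_at FROM revisions ORDER BY committed_at) TO '/tmp/revisions.csv' DELIMITER ',' CSV HEADER;

Note that COPY ... TO writes the file on the database server and needs the corresponding privileges; the psql meta-command \copy accepts the same kind of query but writes the file on the client side.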

Prefer to use UPPERCASE for SQL keywords.

Some frequently used Linux commands

append a .jpg extension to every file under the current directory: find . -type f -exec mv '{}' '{}'.jpg \;

concatenate all CSV files into one: cat *.csv > combined.csv
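
One caveat: if every CSV file starts with a header row, plain cat will repeat that header once per file. A sketch that keeps only the first header (assuming standard awk) is:

awk 'FNR==1 && NR!=1 {next} {print}' *.csv > combined.csv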

copy a local file to a remote machine: scp /file/to/send username@remote:/where/to/put

copy a remote file to the local machine: scp username@remote:/file/to/send /where/to/put