Resolve issues #5 and #1: reduce number of collisions in the ptrack map by ololobus · Pull Request #6 · postgrespro/ptrack

ololobus · 2021-04-22T20:13:27Z

Resolve #5
Resolve #1

…slots. Previously we thought that a 1 MB map could track changes page by page for 1 GB of data files. However, it recently became evident that our ptrack map, a basic hash table, behaves more like a Bloom filter with k = 1 hash functions. See https://en.wikipedia.org/wiki/Bloom_filter#Probability_of_false_positives. Such a filter naturally has more collisions. By storing the update_lsn of each block in an additional slot, we behave like a Bloom filter with k = 2, which significantly reduces the collision rate.

Also bump extversion to 2.2
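
To make the Bloom filter argument above concrete, here is the textbook false-positive approximation from the linked article, applied to k = 1 and k = 2. It assumes independent, uniformly distributed hash functions, m map slots and n changed pages; the load n/m = 0.01 is an illustrative number, not a measurement from ptrack.

$$
p_k \approx \left(1 - e^{-kn/m}\right)^{k}
$$

$$
p_1 \approx 1 - e^{-0.01} \approx 0.00995, \qquad
p_2 \approx \left(1 - e^{-0.02}\right)^{2} \approx 0.00039
$$

Under these assumptions the second slot cuts the collision rate by roughly 25 times at 1% load. The advantage shrinks as the map fills up: with this approximation, k = 2 stays ahead of k = 1 only while n/m is below roughly 0.48.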

funny-falcon · 2021-04-25T20:53:03Z

ptrack.c

+		update_lsn1 = pg_atomic_read_u64(&ptrack_map->entries[slot1]);
+		update_lsn2 = pg_atomic_read_u64(&ptrack_map->entries[slot2]);


It is better to fetch and check slot1 first, and to fetch and check slot2 only if that check passes.
This way you save the TLB and cache misses on slot2 for most pages.
Note that the compiler cannot optimize away or reorder atomic instructions on its own.

OK, I hope that I did it

Probe the second slot only if the first one succeeded.
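
A minimal sketch of the probing order suggested in this thread, written with C11 atomics instead of PostgreSQL's `pg_atomic_read_u64` so that it compiles on its own. The `demo_map` type, the function name, and the exact comparison (a non-zero LSN that is at least `start_lsn`) are assumptions for illustration, not the code from ptrack.c.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ptrack map: an array of atomically updated LSNs. */
typedef struct
{
    size_t            nslots;
    _Atomic uint64_t *entries;
} demo_map;

/*
 * Returns true if the block hashed to (slot1, slot2) may have changed
 * since start_lsn. The second slot is read only when the first one has
 * already passed the check, so most unchanged blocks touch one cache line.
 */
static bool
demo_block_maybe_changed(demo_map *map, size_t slot1, size_t slot2,
                         uint64_t start_lsn)
{
    uint64_t update_lsn1 = atomic_load(&map->entries[slot1]);

    /* Assumption: 0 means "never updated", as with an invalid LSN. */
    if (update_lsn1 == 0 || update_lsn1 < start_lsn)
        return false;           /* slot2 is never fetched on this path */

    uint64_t update_lsn2 = atomic_load(&map->entries[slot2]);

    return update_lsn2 != 0 && update_lsn2 >= start_lsn;
}
```

The early return is what saves the second memory access: the load of slot2 sits behind the branch in the source, so it is not issued for blocks that fail on slot1.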

funny-falcon · 2021-05-13T08:45:15Z

ptrack--2.1--2.2.sql

+	FROM
+		(SELECT count(path) AS changed_files,
+				sum(
+					length(replace(right((pagemap)::text, -1)::varbit::text, '0', ''))


If the tables are 8 TB, this line will require allocating 1 GB of memory for the ::varbit::text conversion.
Accordingly, a 16 TB table would require 2 GB of memory, and Postgres itself simply will not allow that.

It is very sad that varbit has no countbits function.

In any case, for ptrack_get_change_stat and ptrack_get_change_file_stat it seems we need to create ptrack_get_pagecount (or some other name).
Or even just implement ptrack_get_change_file_stat entirely in C.

But tables are split into 1 GB segments by default, and ptrack_get_pagemapset() returns bitmaps per file/segment to begin with, so each conversion would need at most 1000 times less memory. Isn't that right?

Ah, OK. I haven't looked at ptrack_get_pagemapset() yet.

Listen, I would still change ptrack_get_pagemapset by adding a count field to its output.
pg_probackup won't break, since it names the fields it wants.

Done
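
The memory estimates in this thread can be checked with a little arithmetic, and counting bits directly avoids the text conversion altogether. The sketch below assumes the default 8 kB PostgreSQL page size and one bitmap bit per page; `demo_pagemap_bits` and `demo_count_bits` are hypothetical helper names, not functions from ptrack.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed default PostgreSQL page size (BLCKSZ). */
#define DEMO_BLCKSZ 8192ULL

/*
 * Number of pagemap bits (one per page) needed for `bytes` of data.
 * Converted with ::varbit::text, each bit becomes one character.
 */
static uint64_t
demo_pagemap_bits(uint64_t bytes)
{
    return bytes / DEMO_BLCKSZ;
}

/* Count set bits in a bitmap directly, without building a text copy. */
static uint64_t
demo_count_bits(const unsigned char *map, size_t len)
{
    uint64_t total = 0;

    for (size_t i = 0; i < len; i++)
    {
        unsigned char b = map[i];

        /* Clear the lowest set bit until the byte is empty. */
        while (b != 0)
        {
            b = (unsigned char) (b & (b - 1));
            total++;
        }
    }
    return total;
}
```

With these assumptions, an 8 TB relation treated as one bitmap gives 2^30 bits, so about 1 GB of text, which matches the first comment. A single 1 GB segment gives 131072 bits, so about 128 kB of text per conversion, which is why the per-segment output of ptrack_get_pagemapset() keeps the SQL version within limits.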

funny-falcon · 2021-05-13T13:53:09Z

engine.c

+
+			/* Delete and try again */
+			durable_unlink(ptrack_path, LOG);
+			is_new_map = true;


I can't find where the unmap is done in this case.
Meanwhile, durable_unlink(ptrack_mmap_path) is called right after the ptrack_map_reinit label.
As a result, the file lingers invisibly in the file system, and its mmap lingers in the process address space.

Perhaps it makes sense to call ptrackCleanFilesAndMap here?

Yes, looks like it. I had doubts about this spot, but then forgot and never got to the bottom of it.
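
A minimal POSIX sketch of the cleanup order discussed in this thread: drop the mapping first, then remove the file, so that neither an unlinked file nor a stale mapping is left behind. The `demo_mapping` type and both function names are hypothetical; this is not the body of ptrackCleanFilesAndMap, and it uses plain `unlink` rather than PostgreSQL's `durable_unlink`.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of one mapped file. */
typedef struct
{
    void  *addr;
    size_t size;
} demo_mapping;

/* Create (or reuse) a file of `size` bytes and map it. Returns 0 on success. */
static int
demo_map_file(const char *path, size_t size, demo_mapping *m)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) size) != 0)
    {
        close(fd);
        return -1;
    }

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    close(fd);                  /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
        return -1;

    m->addr = addr;
    m->size = size;
    return 0;
}

/* Unmap first, then unlink. Returns 0 only if both steps succeed. */
static int
demo_clean_file_and_map(const char *path, demo_mapping *m)
{
    int rc = 0;

    if (m->addr != NULL)
    {
        if (munmap(m->addr, m->size) != 0)
            rc = -1;
        m->addr = NULL;
        m->size = 0;
    }
    if (unlink(path) != 0)
        rc = -1;
    return rc;
}
```

If only the unlink step runs, the kernel keeps the file's blocks allocated for as long as the mapping exists, which is the "invisible file" situation described in the review comment.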

ololobus · 2021-05-16T17:00:22Z

Everything seems to be working, so I'm merging this one. If internal QA finds anything, we will fix it in master or in another PR.

ololobus force-pushed the double_slot branch 3 times, most recently from 4c32e9e to 38aa439 Compare April 22, 2021 21:33

Resolve issue#1: add ptrack_get_change_stat().

829f96c

Also bump extversion to 2.2

ololobus force-pushed the double_slot branch from 38aa439 to 829f96c Compare April 22, 2021 21:35

ololobus changed the title ~~Resolve issue#5: reduce number of collisions in the ptrack map~~ Resolve issue #5 and #1: reduce number of collisions in the ptrack map Apr 22, 2021

ololobus changed the title ~~Resolve issue #5 and #1: reduce number of collisions in the ptrack map~~ Resolve issues #5 and #1: reduce number of collisions in the ptrack map Apr 22, 2021

funny-falcon reviewed Apr 25, 2021


ololobus added 2 commits May 12, 2021 20:02

Add new function ptrack_get_change_file_stat(start_lsn pg_lsn)

3026be9

Slightly optimize ptrack_get_pagemapset

cf8e309

Probe the second slot only if the first one succeeded.

funny-falcon reviewed May 13, 2021


ololobus added 3 commits May 13, 2021 18:56

Do a proper cleanup when ptrack.map version is incompatible

fbfba8c

Correct some typos

ab17447

Refactor stats API and remove ptrack_get_change_file_stat

9c132a3

ololobus merged commit 708c8e2 into master May 16, 2021

ololobus merged 7 commits into master from double_slot
