Skip to content

Conversation

@williamFalcon
Copy link
Contributor

@williamFalcon williamFalcon commented Jun 13, 2020

  • Enables prepare_data only from global rank 0 if requested or from each node's rank 0
  • Clarifies local_rank vs global_rank
  • adds self.is_global_zero for ease of ops only for node=0, local=0

Fixes #1878
Fixes #2163

@pep8speaks
Copy link

pep8speaks commented Jun 13, 2020

Hello @williamFalcon! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-06-13 02:57:12 UTC

@williamFalcon williamFalcon changed the title Ananthsub patch 1 [WIP] Ananthsub patch 1 Jun 13, 2020
@mergify mergify bot requested a review from a team June 13, 2020 00:33
@williamFalcon williamFalcon changed the title [WIP] Ananthsub patch 1 Finish Ananthsub patch 1 (enable prepare_data from correct processes). clarify local vs global rank Jun 13, 2020
@williamFalcon williamFalcon added this to the 0.8.0 milestone Jun 13, 2020
@williamFalcon williamFalcon added the priority: 0 High priority task label Jun 13, 2020
@mergify mergify bot requested a review from a team June 13, 2020 02:30
('max_epochs', (<class 'int'>,), 1000),
...
('precision', (<class 'int'>,), 32),
('prepare_data_per_node', (<class 'bool'>,), True),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is in the wrong position. it should go after auto_scale_batch, which is not listed here. so the solution is to remove it.
also you removed
doctest: +ELLIPSIS +NORMALIZE_WHITESPACE
which I am fairly confident is needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is it here?
if i remove it, it still fails

Copy link
Contributor

@awaelchli awaelchli Jun 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well maybe I was wrong and it is just the missing +ELLIPSIS +NORMALIZE_WHITESPACE
cloned and doctest runs locally.
so i think its ok now

@mergify mergify bot requested a review from a team June 13, 2020 13:00
@williamFalcon williamFalcon merged commit 5fd01b0 into master Jun 13, 2020
@Borda Borda deleted the ananthsub-patch-1 branch June 13, 2020 16:32
@Borda Borda changed the title Finish Ananthsub patch 1 (enable prepare_data from correct processes). clarify local vs global rank enable prepare_data from correct processes - clarify local vs global rank Jun 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

priority: 0 High priority task

Projects

None yet

Development

Successfully merging this pull request may close these issues.

prepare_data called multiple times per node for slurm and elastic training

6 participants