Streaming output truncated to the last 5000 lines.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.56949 (+1.18181)
| > avg_decoder_loss: 24.96291 (-0.01439)
| > avg_postnet_loss: 15.73567 (-0.16655)
| > avg_stopnet_loss: 0.76924 (-0.00075)
| > avg_loss: 10.94389 (-0.04599)
| > avg_align_error: 0.93381 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_220.pth
> EPOCH: 110/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:30:20)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.94934 (24.94934)
| > postnet_loss: 15.56663 (15.56663)
| > stopnet_loss: 0.76870 (0.76870)
| > loss: 10.89769 (10.89769)
| > align_error: 0.93383 (0.93383)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.40486 (-1.16463)
| > avg_decoder_loss: 24.94934 (-0.01357)
| > avg_postnet_loss: 15.56663 (-0.16904)
| > avg_stopnet_loss: 0.76870 (-0.00054)
| > avg_loss: 10.89769 (-0.04620)
| > avg_align_error: 0.93383 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_222.pth
> EPOCH: 111/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:30:36)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.93512 (24.93512)
| > postnet_loss: 15.41228 (15.41228)
| > stopnet_loss: 0.76803 (0.76803)
| > loss: 10.85488 (10.85488)
| > align_error: 0.93386 (0.93386)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.38025 (-0.02460)
| > avg_decoder_loss: 24.93512 (-0.01423)
| > avg_postnet_loss: 15.41228 (-0.15435)
| > avg_stopnet_loss: 0.76803 (-0.00067)
| > avg_loss: 10.85488 (-0.04281)
| > avg_align_error: 0.93386 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_224.pth
> EPOCH: 112/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:30:52)
--> STEP: 1/2 -- GLOBAL_STEP: 225
| > decoder_loss: 30.33993 (30.33993)
| > postnet_loss: 26.17014 (26.17014)
| > stopnet_loss: 0.76161 (0.76161)
| > loss: 14.88913 (14.88913)
| > align_error: 0.97002 (0.97002)
| > grad_norm: 8.83697 (8.83697)
| > current_lr: 0.00000
| > step_time: 2.75190 (2.75195)
| > loader_time: 0.00990 (0.00987)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.91999 (24.91999)
| > postnet_loss: 15.25691 (15.25691)
| > stopnet_loss: 0.76717 (0.76717)
| > loss: 10.81139 (10.81139)
| > align_error: 0.93388 (0.93388)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.39298 (+0.01273)
| > avg_decoder_loss: 24.91999 (-0.01512)
| > avg_postnet_loss: 15.25691 (-0.15536)
| > avg_stopnet_loss: 0.76717 (-0.00087)
| > avg_loss: 10.81139 (-0.04349)
| > avg_align_error: 0.93388 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_226.pth
> EPOCH: 113/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:31:07)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.90399 (24.90399)
| > postnet_loss: 15.09707 (15.09707)
| > stopnet_loss: 0.76637 (0.76637)
| > loss: 10.76663 (10.76663)
| > align_error: 0.93391 (0.93391)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.38124 (-0.01174)
| > avg_decoder_loss: 24.90399 (-0.01600)
| > avg_postnet_loss: 15.09707 (-0.15985)
| > avg_stopnet_loss: 0.76637 (-0.00080)
| > avg_loss: 10.76663 (-0.04476)
| > avg_align_error: 0.93391 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_228.pth
> EPOCH: 114/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:31:23)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.88791 (24.88791)
| > postnet_loss: 14.94401 (14.94401)
| > stopnet_loss: 0.76569 (0.76569)
| > loss: 10.72367 (10.72367)
| > align_error: 0.93394 (0.93394)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.38205 (+0.00081)
| > avg_decoder_loss: 24.88791 (-0.01609)
| > avg_postnet_loss: 14.94401 (-0.15305)
| > avg_stopnet_loss: 0.76569 (-0.00067)
| > avg_loss: 10.72367 (-0.04296)
| > avg_align_error: 0.93394 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_230.pth
> EPOCH: 115/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:31:38)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.87119 (24.87119)
| > postnet_loss: 14.80095 (14.80095)
| > stopnet_loss: 0.76479 (0.76479)
| > loss: 10.68282 (10.68282)
| > align_error: 0.93396 (0.93396)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.39776 (+0.01571)
| > avg_decoder_loss: 24.87119 (-0.01672)
| > avg_postnet_loss: 14.80095 (-0.14306)
| > avg_stopnet_loss: 0.76479 (-0.00091)
| > avg_loss: 10.68282 (-0.04085)
| > avg_align_error: 0.93396 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_232.pth
> EPOCH: 116/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:31:53)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.85367 (24.85367)
| > postnet_loss: 14.65728 (14.65728)
| > stopnet_loss: 0.76392 (0.76392)
| > loss: 10.64166 (10.64166)
| > align_error: 0.93399 (0.93399)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.38093 (-0.01683)
| > avg_decoder_loss: 24.85367 (-0.01752)
| > avg_postnet_loss: 14.65728 (-0.14366)
| > avg_stopnet_loss: 0.76392 (-0.00086)
| > avg_loss: 10.64166 (-0.04116)
| > avg_align_error: 0.93399 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_234.pth
> EPOCH: 117/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:32:09)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.83603 (24.83603)
| > postnet_loss: 14.52243 (14.52243)
| > stopnet_loss: 0.76315 (0.76315)
| > loss: 10.60277 (10.60277)
| > align_error: 0.93401 (0.93401)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.70377 (+0.32284)
| > avg_decoder_loss: 24.83603 (-0.01763)
| > avg_postnet_loss: 14.52243 (-0.13486)
| > avg_stopnet_loss: 0.76315 (-0.00077)
| > avg_loss: 10.60277 (-0.03890)
| > avg_align_error: 0.93401 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_236.pth
> EPOCH: 118/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:32:38)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.81760 (24.81760)
| > postnet_loss: 14.39176 (14.39176)
| > stopnet_loss: 0.76246 (0.76246)
| > loss: 10.56480 (10.56480)
| > align_error: 0.93404 (0.93404)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.57390 (-0.12987)
| > avg_decoder_loss: 24.81760 (-0.01844)
| > avg_postnet_loss: 14.39176 (-0.13066)
| > avg_stopnet_loss: 0.76246 (-0.00069)
| > avg_loss: 10.56480 (-0.03797)
| > avg_align_error: 0.93404 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_238.pth
> EPOCH: 119/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:32:54)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.79859 (24.79859)
| > postnet_loss: 14.25896 (14.25896)
| > stopnet_loss: 0.76150 (0.76150)
| > loss: 10.52589 (10.52589)
| > align_error: 0.93407 (0.93407)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.53127 (-0.04263)
| > avg_decoder_loss: 24.79859 (-0.01901)
| > avg_postnet_loss: 14.25896 (-0.13280)
| > avg_stopnet_loss: 0.76150 (-0.00096)
| > avg_loss: 10.52589 (-0.03891)
| > avg_align_error: 0.93407 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_240.pth
> EPOCH: 120/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:33:10)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.77885 (24.77885)
| > postnet_loss: 14.14054 (14.14054)
| > stopnet_loss: 0.76049 (0.76049)
| > loss: 10.49034 (10.49034)
| > align_error: 0.93410 (0.93410)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.52745 (-0.00382)
| > avg_decoder_loss: 24.77885 (-0.01974)
| > avg_postnet_loss: 14.14054 (-0.11842)
| > avg_stopnet_loss: 0.76049 (-0.00101)
| > avg_loss: 10.49034 (-0.03555)
| > avg_align_error: 0.93410 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_242.pth
> EPOCH: 121/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:33:26)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.75903 (24.75903)
| > postnet_loss: 14.01721 (14.01721)
| > stopnet_loss: 0.75958 (0.75958)
| > loss: 10.45364 (10.45364)
| > align_error: 0.93413 (0.93413)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.54692 (+0.01947)
| > avg_decoder_loss: 24.75903 (-0.01982)
| > avg_postnet_loss: 14.01721 (-0.12334)
| > avg_stopnet_loss: 0.75958 (-0.00091)
| > avg_loss: 10.45364 (-0.03670)
| > avg_align_error: 0.93413 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_244.pth
> EPOCH: 122/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:33:41)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.73801 (24.73801)
| > postnet_loss: 13.89961 (13.89961)
| > stopnet_loss: 0.75879 (0.75879)
| > loss: 10.41820 (10.41820)
| > align_error: 0.93415 (0.93415)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.51816 (-0.02876)
| > avg_decoder_loss: 24.73801 (-0.02101)
| > avg_postnet_loss: 13.89961 (-0.11760)
| > avg_stopnet_loss: 0.75879 (-0.00079)
| > avg_loss: 10.41820 (-0.03544)
| > avg_align_error: 0.93415 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_246.pth
> EPOCH: 123/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:33:56)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.71552 (24.71552)
| > postnet_loss: 13.79109 (13.79109)
| > stopnet_loss: 0.75797 (0.75797)
| > loss: 10.38462 (10.38462)
| > align_error: 0.93418 (0.93418)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.59217 (+0.07401)
| > avg_decoder_loss: 24.71552 (-0.02249)
| > avg_postnet_loss: 13.79109 (-0.10851)
| > avg_stopnet_loss: 0.75797 (-0.00082)
| > avg_loss: 10.38462 (-0.03358)
| > avg_align_error: 0.93418 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_248.pth
> EPOCH: 124/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:34:11)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.69184 (24.69184)
| > postnet_loss: 13.67974 (13.67974)
| > stopnet_loss: 0.75692 (0.75692)
| > loss: 10.34981 (10.34981)
| > align_error: 0.93421 (0.93421)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.54020 (-0.05197)
| > avg_decoder_loss: 24.69184 (-0.02368)
| > avg_postnet_loss: 13.67974 (-0.11136)
| > avg_stopnet_loss: 0.75692 (-0.00105)
| > avg_loss: 10.34981 (-0.03481)
| > avg_align_error: 0.93421 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_250.pth
> EPOCH: 125/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:34:28)
--> STEP: 0/2 -- GLOBAL_STEP: 250
| > decoder_loss: 28.14487 (28.14487)
| > postnet_loss: 23.95758 (23.95758)
| > stopnet_loss: 0.74521 (0.74521)
| > loss: 13.77082 (13.77082)
| > align_error: 0.95789 (0.95789)
| > grad_norm: 6.26954 (6.26954)
| > current_lr: 0.00000
| > step_time: 1.95020 (1.95022)
| > loader_time: 3.93320 (3.93319)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.66791 (24.66791)
| > postnet_loss: 13.57413 (13.57413)
| > stopnet_loss: 0.75562 (0.75562)
| > loss: 10.31613 (10.31613)
| > align_error: 0.93424 (0.93424)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.70751 (+0.16731)
| > avg_decoder_loss: 24.66791 (-0.02394)
| > avg_postnet_loss: 13.57413 (-0.10561)
| > avg_stopnet_loss: 0.75562 (-0.00129)
| > avg_loss: 10.31613 (-0.03368)
| > avg_align_error: 0.93424 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_252.pth
> EPOCH: 126/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:34:57)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.64308 (24.64308)
| > postnet_loss: 13.46872 (13.46872)
| > stopnet_loss: 0.75464 (0.75464)
| > loss: 10.28259 (10.28259)
| > align_error: 0.93427 (0.93427)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.40032 (-0.30719)
| > avg_decoder_loss: 24.64308 (-0.02483)
| > avg_postnet_loss: 13.46872 (-0.10541)
| > avg_stopnet_loss: 0.75464 (-0.00099)
| > avg_loss: 10.28259 (-0.03355)
| > avg_align_error: 0.93427 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_254.pth
> EPOCH: 127/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:35:13)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.61713 (24.61713)
| > postnet_loss: 13.37599 (13.37599)
| > stopnet_loss: 0.75351 (0.75351)
| > loss: 10.25179 (10.25179)
| > align_error: 0.93430 (0.93430)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.38094 (-0.01938)
| > avg_decoder_loss: 24.61713 (-0.02595)
| > avg_postnet_loss: 13.37599 (-0.09273)
| > avg_stopnet_loss: 0.75351 (-0.00113)
| > avg_loss: 10.25179 (-0.03080)
| > avg_align_error: 0.93430 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_256.pth
> EPOCH: 128/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:35:29)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.59013 (24.59013)
| > postnet_loss: 13.29527 (13.29527)
| > stopnet_loss: 0.75232 (0.75232)
| > loss: 10.22367 (10.22367)
| > align_error: 0.93433 (0.93433)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.40416 (+0.02321)
| > avg_decoder_loss: 24.59013 (-0.02699)
| > avg_postnet_loss: 13.29527 (-0.08072)
| > avg_stopnet_loss: 0.75232 (-0.00119)
| > avg_loss: 10.22367 (-0.02811)
| > avg_align_error: 0.93433 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_258.pth
> EPOCH: 129/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:35:44)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.56199 (24.56199)
| > postnet_loss: 13.20905 (13.20905)
| > stopnet_loss: 0.75101 (0.75101)
| > loss: 10.19377 (10.19377)
| > align_error: 0.93436 (0.93436)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.37682 (-0.02734)
| > avg_decoder_loss: 24.56199 (-0.02814)
| > avg_postnet_loss: 13.20905 (-0.08622)
| > avg_stopnet_loss: 0.75101 (-0.00132)
| > avg_loss: 10.19377 (-0.02991)
| > avg_align_error: 0.93436 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_260.pth
> EPOCH: 130/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:36:00)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.53213 (24.53213)
| > postnet_loss: 13.12643 (13.12643)
| > stopnet_loss: 0.74978 (0.74978)
| > loss: 10.16441 (10.16441)
| > align_error: 0.93439 (0.93439)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.02092 (+0.64410)
| > avg_decoder_loss: 24.53213 (-0.02987)
| > avg_postnet_loss: 13.12643 (-0.08262)
| > avg_stopnet_loss: 0.74978 (-0.00123)
| > avg_loss: 10.16441 (-0.02935)
| > avg_align_error: 0.93439 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_262.pth
> EPOCH: 131/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:36:17)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.50153 (24.50153)
| > postnet_loss: 13.04998 (13.04998)
| > stopnet_loss: 0.74841 (0.74841)
| > loss: 10.13629 (10.13629)
| > align_error: 0.93442 (0.93442)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.56871 (-0.45221)
| > avg_decoder_loss: 24.50153 (-0.03060)
| > avg_postnet_loss: 13.04998 (-0.07645)
| > avg_stopnet_loss: 0.74841 (-0.00137)
| > avg_loss: 10.13629 (-0.02813)
| > avg_align_error: 0.93442 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_264.pth
> EPOCH: 132/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:36:32)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.47037 (24.47037)
| > postnet_loss: 12.97397 (12.97397)
| > stopnet_loss: 0.74710 (0.74710)
| > loss: 10.10819 (10.10819)
| > align_error: 0.93445 (0.93445)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.55167 (-0.01704)
| > avg_decoder_loss: 24.47037 (-0.03116)
| > avg_postnet_loss: 12.97397 (-0.07601)
| > avg_stopnet_loss: 0.74710 (-0.00131)
| > avg_loss: 10.10819 (-0.02810)
| > avg_align_error: 0.93445 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_266.pth
> EPOCH: 133/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:36:47)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.43732 (24.43732)
| > postnet_loss: 12.90439 (12.90439)
| > stopnet_loss: 0.74572 (0.74572)
| > loss: 10.08115 (10.08115)
| > align_error: 0.93449 (0.93449)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.35288 (+0.80121)
| > avg_decoder_loss: 24.43732 (-0.03306)
| > avg_postnet_loss: 12.90439 (-0.06958)
| > avg_stopnet_loss: 0.74572 (-0.00138)
| > avg_loss: 10.08115 (-0.02704)
| > avg_align_error: 0.93449 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_268.pth
> EPOCH: 134/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:37:16)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.40205 (24.40205)
| > postnet_loss: 12.84301 (12.84301)
| > stopnet_loss: 0.74424 (0.74424)
| > loss: 10.05551 (10.05551)
| > align_error: 0.93452 (0.93452)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.37588 (-0.97701)
| > avg_decoder_loss: 24.40205 (-0.03526)
| > avg_postnet_loss: 12.84301 (-0.06138)
| > avg_stopnet_loss: 0.74424 (-0.00148)
| > avg_loss: 10.05551 (-0.02564)
| > avg_align_error: 0.93452 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_270.pth
> EPOCH: 135/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:37:31)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.36577 (24.36577)
| > postnet_loss: 12.78301 (12.78301)
| > stopnet_loss: 0.74244 (0.74244)
| > loss: 10.02964 (10.02964)
| > align_error: 0.93455 (0.93455)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.79215 (+0.41627)
| > avg_decoder_loss: 24.36577 (-0.03628)
| > avg_postnet_loss: 12.78301 (-0.05999)
| > avg_stopnet_loss: 0.74244 (-0.00180)
| > avg_loss: 10.02964 (-0.02587)
| > avg_align_error: 0.93455 (+0.00003)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_272.pth
> EPOCH: 136/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:37:47)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.32769 (24.32769)
| > postnet_loss: 12.72485 (12.72485)
| > stopnet_loss: 0.74075 (0.74075)
| > loss: 10.00389 (10.00389)
| > align_error: 0.93459 (0.93459)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.38061 (-0.41154)
| > avg_decoder_loss: 24.32769 (-0.03808)
| > avg_postnet_loss: 12.72485 (-0.05816)
| > avg_stopnet_loss: 0.74075 (-0.00169)
| > avg_loss: 10.00389 (-0.02575)
| > avg_align_error: 0.93459 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_274.pth
> EPOCH: 137/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:38:02)
--> STEP: 1/2 -- GLOBAL_STEP: 275
| > decoder_loss: 29.78144 (29.78144)
| > postnet_loss: 22.82139 (22.82139)
| > stopnet_loss: 0.73713 (0.73713)
| > loss: 13.88784 (13.88784)
| > align_error: 0.97020 (0.97020)
| > grad_norm: 5.63345 (5.63345)
| > current_lr: 0.00000
| > step_time: 2.32140 (2.32137)
| > loader_time: 0.00550 (0.00552)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.28755 (24.28755)
| > postnet_loss: 12.66898 (12.66898)
| > stopnet_loss: 0.73894 (0.73894)
| > loss: 9.97807 (9.97807)
| > align_error: 0.93462 (0.93462)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.38795 (+0.00734)
| > avg_decoder_loss: 24.28755 (-0.04014)
| > avg_postnet_loss: 12.66898 (-0.05587)
| > avg_stopnet_loss: 0.73894 (-0.00182)
| > avg_loss: 9.97807 (-0.02582)
| > avg_align_error: 0.93462 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_276.pth
> EPOCH: 138/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:38:17)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.24691 (24.24691)
| > postnet_loss: 12.61333 (12.61333)
| > stopnet_loss: 0.73689 (0.73689)
| > loss: 9.95195 (9.95195)
| > align_error: 0.93466 (0.93466)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.38630 (-0.00165)
| > avg_decoder_loss: 24.24691 (-0.04064)
| > avg_postnet_loss: 12.61333 (-0.05565)
| > avg_stopnet_loss: 0.73689 (-0.00204)
| > avg_loss: 9.95195 (-0.02612)
| > avg_align_error: 0.93466 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_278.pth
> EPOCH: 139/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:38:32)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.20509 (24.20509)
| > postnet_loss: 12.57103 (12.57103)
| > stopnet_loss: 0.73505 (0.73505)
| > loss: 9.92908 (9.92908)
| > align_error: 0.93470 (0.93470)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 2.12122 (+1.73492)
| > avg_decoder_loss: 24.20509 (-0.04181)
| > avg_postnet_loss: 12.57103 (-0.04230)
| > avg_stopnet_loss: 0.73505 (-0.00184)
| > avg_loss: 9.92908 (-0.02287)
| > avg_align_error: 0.93470 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_280.pth
> EPOCH: 140/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:40:05)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.16056 (24.16056)
| > postnet_loss: 12.52215 (12.52215)
| > stopnet_loss: 0.73308 (0.73308)
| > loss: 9.90376 (9.90376)
| > align_error: 0.93473 (0.93473)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.53318 (-1.58804)
| > avg_decoder_loss: 24.16056 (-0.04454)
| > avg_postnet_loss: 12.52215 (-0.04887)
| > avg_stopnet_loss: 0.73308 (-0.00197)
| > avg_loss: 9.90376 (-0.02532)
| > avg_align_error: 0.93473 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_282.pth
> EPOCH: 141/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:40:21)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.11544 (24.11544)
| > postnet_loss: 12.47741 (12.47741)
| > stopnet_loss: 0.73080 (0.73080)
| > loss: 9.87901 (9.87901)
| > align_error: 0.93477 (0.93477)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.54237 (+0.00919)
| > avg_decoder_loss: 24.11544 (-0.04511)
| > avg_postnet_loss: 12.47741 (-0.04475)
| > avg_stopnet_loss: 0.73080 (-0.00229)
| > avg_loss: 9.87901 (-0.02475)
| > avg_align_error: 0.93477 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_284.pth
> EPOCH: 142/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:40:36)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.06737 (24.06737)
| > postnet_loss: 12.43430 (12.43430)
| > stopnet_loss: 0.72849 (0.72849)
| > loss: 9.85390 (9.85390)
| > align_error: 0.93481 (0.93481)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.54418 (+0.00181)
| > avg_decoder_loss: 24.06737 (-0.04808)
| > avg_postnet_loss: 12.43430 (-0.04311)
| > avg_stopnet_loss: 0.72849 (-0.00231)
| > avg_loss: 9.85390 (-0.02511)
| > avg_align_error: 0.93481 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_286.pth
> EPOCH: 143/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:40:52)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 24.01711 (24.01711)
| > postnet_loss: 12.39214 (12.39214)
| > stopnet_loss: 0.72584 (0.72584)
| > loss: 9.82815 (9.82815)
| > align_error: 0.93485 (0.93485)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.54488 (+0.00070)
| > avg_decoder_loss: 24.01711 (-0.05025)
| > avg_postnet_loss: 12.39214 (-0.04216)
| > avg_stopnet_loss: 0.72584 (-0.00265)
| > avg_loss: 9.82815 (-0.02575)
| > avg_align_error: 0.93485 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_288.pth
> EPOCH: 144/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:41:07)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.96517 (23.96517)
| > postnet_loss: 12.35137 (12.35137)
| > stopnet_loss: 0.72327 (0.72327)
| > loss: 9.80241 (9.80241)
| > align_error: 0.93489 (0.93489)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.47806 (+0.93318)
| > avg_decoder_loss: 23.96517 (-0.05194)
| > avg_postnet_loss: 12.35137 (-0.04077)
| > avg_stopnet_loss: 0.72327 (-0.00257)
| > avg_loss: 9.80241 (-0.02574)
| > avg_align_error: 0.93489 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_290.pth
> EPOCH: 145/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:42:35)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.91213 (23.91213)
| > postnet_loss: 12.32036 (12.32036)
| > stopnet_loss: 0.72093 (0.72093)
| > loss: 9.77905 (9.77905)
| > align_error: 0.93493 (0.93493)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.64957 (+0.17150)
| > avg_decoder_loss: 23.91213 (-0.05304)
| > avg_postnet_loss: 12.32036 (-0.03101)
| > avg_stopnet_loss: 0.72093 (-0.00234)
| > avg_loss: 9.77905 (-0.02336)
| > avg_align_error: 0.93493 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_292.pth
> EPOCH: 146/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:44:04)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.85706 (23.85706)
| > postnet_loss: 12.28907 (12.28907)
| > stopnet_loss: 0.71845 (0.71845)
| > loss: 9.75498 (9.75498)
| > align_error: 0.93497 (0.93497)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.65575 (+0.00618)
| > avg_decoder_loss: 23.85706 (-0.05508)
| > avg_postnet_loss: 12.28907 (-0.03129)
| > avg_stopnet_loss: 0.71845 (-0.00248)
| > avg_loss: 9.75498 (-0.02408)
| > avg_align_error: 0.93497 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_294.pth
> EPOCH: 147/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:45:30)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.79927 (23.79927)
| > postnet_loss: 12.25081 (12.25081)
| > stopnet_loss: 0.71556 (0.71556)
| > loss: 9.72808 (9.72808)
| > align_error: 0.93501 (0.93501)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.40748 (-1.24827)
| > avg_decoder_loss: 23.79927 (-0.05779)
| > avg_postnet_loss: 12.25081 (-0.03826)
| > avg_stopnet_loss: 0.71556 (-0.00288)
| > avg_loss: 9.72808 (-0.02690)
| > avg_align_error: 0.93501 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_296.pth
> EPOCH: 148/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:45:47)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.73929 (23.73929)
| > postnet_loss: 12.21506 (12.21506)
| > stopnet_loss: 0.71260 (0.71260)
| > loss: 9.70119 (9.70119)
| > align_error: 0.93506 (0.93506)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.37874 (-0.02875)
| > avg_decoder_loss: 23.73929 (-0.05998)
| > avg_postnet_loss: 12.21506 (-0.03575)
| > avg_stopnet_loss: 0.71260 (-0.00296)
| > avg_loss: 9.70119 (-0.02689)
| > avg_align_error: 0.93506 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_298.pth
> EPOCH: 149/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:46:08)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.67859 (23.67859)
| > postnet_loss: 12.18098 (12.18098)
| > stopnet_loss: 0.70969 (0.70969)
| > loss: 9.67458 (9.67458)
| > align_error: 0.93510 (0.93510)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.38760 (+0.00886)
| > avg_decoder_loss: 23.67859 (-0.06070)
| > avg_postnet_loss: 12.18098 (-0.03408)
| > avg_stopnet_loss: 0.70969 (-0.00291)
| > avg_loss: 9.67458 (-0.02660)
| > avg_align_error: 0.93510 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_300.pth
> EPOCH: 150/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:46:24)
--> STEP: 0/2 -- GLOBAL_STEP: 300
| > decoder_loss: 27.24969 (27.24969)
| > postnet_loss: 21.39986 (21.39986)
| > stopnet_loss: 0.70556 (0.70556)
| > loss: 12.86795 (12.86795)
| > align_error: 0.95804 (0.95804)
| > grad_norm: 4.39074 (4.39074)
| > current_lr: 0.00000
| > step_time: 0.75780 (0.75777)
| > loader_time: 71.91820 (71.91817)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.61572 (23.61572)
| > postnet_loss: 12.14773 (12.14773)
| > stopnet_loss: 0.70669 (0.70669)
| > loss: 9.64756 (9.64756)
| > align_error: 0.93514 (0.93514)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.47811 (+1.09051)
| > avg_decoder_loss: 23.61572 (-0.06286)
| > avg_postnet_loss: 12.14773 (-0.03325)
| > avg_stopnet_loss: 0.70669 (-0.00300)
| > avg_loss: 9.64756 (-0.02703)
| > avg_align_error: 0.93514 (+0.00004)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_302.pth
> EPOCH: 151/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:47:52)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.54928 (23.54928)
| > postnet_loss: 12.11641 (12.11641)
| > stopnet_loss: 0.70334 (0.70334)
| > loss: 9.61977 (9.61977)
| > align_error: 0.93519 (0.93519)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.52243 (+0.04432)
| > avg_decoder_loss: 23.54928 (-0.06644)
| > avg_postnet_loss: 12.11641 (-0.03132)
| > avg_stopnet_loss: 0.70334 (-0.00335)
| > avg_loss: 9.61977 (-0.02779)
| > avg_align_error: 0.93519 (+0.00005)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_304.pth
> EPOCH: 152/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:49:19)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.48001 (23.48001)
| > postnet_loss: 12.08576 (12.08576)
| > stopnet_loss: 0.69999 (0.69999)
| > loss: 9.59144 (9.59144)
| > align_error: 0.93524 (0.93524)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.62266 (+0.10022)
| > avg_decoder_loss: 23.48001 (-0.06927)
| > avg_postnet_loss: 12.08576 (-0.03065)
| > avg_stopnet_loss: 0.69999 (-0.00335)
| > avg_loss: 9.59144 (-0.02833)
| > avg_align_error: 0.93524 (+0.00005)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_306.pth
> EPOCH: 153/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:50:44)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.40978 (23.40978)
| > postnet_loss: 12.05594 (12.05594)
| > stopnet_loss: 0.69660 (0.69660)
| > loss: 9.56303 (9.56303)
| > align_error: 0.93528 (0.93528)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.45256 (-0.17010)
| > avg_decoder_loss: 23.40978 (-0.07023)
| > avg_postnet_loss: 12.05594 (-0.02982)
| > avg_stopnet_loss: 0.69660 (-0.00339)
| > avg_loss: 9.56303 (-0.02841)
| > avg_align_error: 0.93528 (+0.00005)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_308.pth
> EPOCH: 154/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:52:11)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.33634 (23.33634)
| > postnet_loss: 12.01772 (12.01772)
| > stopnet_loss: 0.69303 (0.69303)
| > loss: 9.53155 (9.53155)
| > align_error: 0.93533 (0.93533)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.45948 (+0.00693)
| > avg_decoder_loss: 23.33634 (-0.07344)
| > avg_postnet_loss: 12.01772 (-0.03822)
| > avg_stopnet_loss: 0.69303 (-0.00357)
| > avg_loss: 9.53155 (-0.03148)
| > avg_align_error: 0.93533 (+0.00005)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_310.pth
> EPOCH: 155/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:53:38)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.26028 (23.26028)
| > postnet_loss: 11.98090 (11.98090)
| > stopnet_loss: 0.68924 (0.68924)
| > loss: 9.49953 (9.49953)
| > align_error: 0.93538 (0.93538)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 0.40167 (-1.05781)
| > avg_decoder_loss: 23.26028 (-0.07607)
| > avg_postnet_loss: 11.98090 (-0.03682)
| > avg_stopnet_loss: 0.68924 (-0.00380)
| > avg_loss: 9.49953 (-0.03202)
| > avg_align_error: 0.93538 (+0.00005)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_312.pth
> EPOCH: 156/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:53:54)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.18120 (23.18120)
| > postnet_loss: 11.94405 (11.94405)
| > stopnet_loss: 0.68525 (0.68525)
| > loss: 9.46656 (9.46656)
| > align_error: 0.93543 (0.93543)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.50391 (+1.10224)
| > avg_decoder_loss: 23.18120 (-0.07908)
| > avg_postnet_loss: 11.94405 (-0.03685)
| > avg_stopnet_loss: 0.68525 (-0.00399)
| > avg_loss: 9.46656 (-0.03297)
| > avg_align_error: 0.93543 (+0.00005)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_314.pth
> EPOCH: 157/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:55:21)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.09868 (23.09868)
| > postnet_loss: 11.90392 (11.90392)
| > stopnet_loss: 0.68117 (0.68117)
| > loss: 9.43182 (9.43182)
| > align_error: 0.93549 (0.93549)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 2.29519 (+0.79127)
| > avg_decoder_loss: 23.09868 (-0.08252)
| > avg_postnet_loss: 11.90392 (-0.04013)
| > avg_stopnet_loss: 0.68117 (-0.00408)
| > avg_loss: 9.43182 (-0.03474)
| > avg_align_error: 0.93549 (+0.00005)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_316.pth
> EPOCH: 158/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 06:56:48)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 23.01368 (23.01368)
| > postnet_loss: 11.86693 (11.86693)
| > stopnet_loss: 0.67710 (0.67710)
| > loss: 9.39726 (9.39726)
| > align_error: 0.93554 (0.93554)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.45273 (-0.84246)
| > avg_decoder_loss: 23.01368 (-0.08500)
| > avg_postnet_loss: 11.86693 (-0.03699)
| > avg_stopnet_loss: 0.67710 (-0.00407)
| > avg_loss: 9.39726 (-0.03457)
| > avg_align_error: 0.93554 (+0.00006)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_318.pth
> EPOCH: 159/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 07:00:31)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 22.92599 (22.92599)
| > postnet_loss: 11.82570 (11.82570)
| > stopnet_loss: 0.67268 (0.67268)
| > loss: 9.36060 (9.36060)
| > align_error: 0.93560 (0.93560)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.48585 (+0.03312)
| > avg_decoder_loss: 22.92599 (-0.08769)
| > avg_postnet_loss: 11.82570 (-0.04123)
| > avg_stopnet_loss: 0.67268 (-0.00443)
| > avg_loss: 9.36060 (-0.03666)
| > avg_align_error: 0.93560 (+0.00006)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_320.pth
> EPOCH: 160/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 07:04:21)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 22.83308 (22.83308)
| > postnet_loss: 11.77874 (11.77874)
| > stopnet_loss: 0.66797 (0.66797)
| > loss: 9.32093 (9.32093)
| > align_error: 0.93566 (0.93566)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.52228 (+0.03644)
| > avg_decoder_loss: 22.83308 (-0.09291)
| > avg_postnet_loss: 11.77874 (-0.04695)
| > avg_stopnet_loss: 0.66797 (-0.00471)
| > avg_loss: 9.32093 (-0.03967)
| > avg_align_error: 0.93566 (+0.00006)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_322.pth
> EPOCH: 161/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 07:10:11)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 22.73636 (22.73636)
| > postnet_loss: 11.72892 (11.72892)
| > stopnet_loss: 0.66331 (0.66331)
| > loss: 9.27963 (9.27963)
| > align_error: 0.93572 (0.93572)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.53309 (+0.01081)
| > avg_decoder_loss: 22.73636 (-0.09672)
| > avg_postnet_loss: 11.72892 (-0.04982)
| > avg_stopnet_loss: 0.66331 (-0.00466)
| > avg_loss: 9.27963 (-0.04130)
| > avg_align_error: 0.93572 (+0.00006)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_324.pth
> EPOCH: 162/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 07:15:59)
--> STEP: 1/2 -- GLOBAL_STEP: 325
| > decoder_loss: 28.35946 (28.35946)
| > postnet_loss: 20.38641 (20.38641)
| > stopnet_loss: 0.68159 (0.68159)
| > loss: 12.86805 (12.86805)
| > align_error: 0.97041 (0.97041)
| > grad_norm: 4.82455 (4.82455)
| > current_lr: 0.00000
| > step_time: 2.79030 (2.79028)
| > loader_time: 0.00630 (0.00632)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 22.63504 (22.63504)
| > postnet_loss: 11.67266 (11.67266)
| > stopnet_loss: 0.65817 (0.65817)
| > loss: 9.23509 (9.23509)
| > align_error: 0.93579 (0.93579)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.43469 (-0.09840)
| > avg_decoder_loss: 22.63504 (-0.10132)
| > avg_postnet_loss: 11.67266 (-0.05626)
| > avg_stopnet_loss: 0.65817 (-0.00514)
| > avg_loss: 9.23509 (-0.04454)
| > avg_align_error: 0.93579 (+0.00006)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_326.pth
> EPOCH: 163/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 07:21:48)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 22.53035 (22.53035)
| > postnet_loss: 11.61585 (11.61585)
| > stopnet_loss: 0.65300 (0.65300)
| > loss: 9.18955 (9.18955)
| > align_error: 0.93585 (0.93585)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.47307 (+0.03838)
| > avg_decoder_loss: 22.53035 (-0.10469)
| > avg_postnet_loss: 11.61585 (-0.05681)
| > avg_stopnet_loss: 0.65300 (-0.00516)
| > avg_loss: 9.18955 (-0.04554)
| > avg_align_error: 0.93585 (+0.00007)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_328.pth
> EPOCH: 164/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 07:27:34)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 22.42075 (22.42075)
| > postnet_loss: 11.55470 (11.55470)
| > stopnet_loss: 0.64737 (0.64737)
| > loss: 9.14123 (9.14123)
| > align_error: 0.93592 (0.93592)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.63643 (+0.16336)
| > avg_decoder_loss: 22.42075 (-0.10960)
| > avg_postnet_loss: 11.55470 (-0.06115)
| > avg_stopnet_loss: 0.64737 (-0.00564)
| > avg_loss: 9.14123 (-0.04832)
| > avg_align_error: 0.93592 (+0.00007)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_330.pth
> EPOCH: 165/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 07:33:22)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 22.30727 (22.30727)
| > postnet_loss: 11.49596 (11.49596)
| > stopnet_loss: 0.64188 (0.64188)
| > loss: 9.09269 (9.09269)
| > align_error: 0.93599 (0.93599)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.76904 (+0.13262)
| > avg_decoder_loss: 22.30727 (-0.11348)
| > avg_postnet_loss: 11.49596 (-0.05874)
| > avg_stopnet_loss: 0.64188 (-0.00548)
| > avg_loss: 9.09269 (-0.04854)
| > avg_align_error: 0.93599 (+0.00007)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_332.pth
> EPOCH: 166/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 07:39:06)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 22.18679 (22.18679)
| > postnet_loss: 11.42594 (11.42594)
| > stopnet_loss: 0.63592 (0.63592)
| > loss: 9.03910 (9.03910)
| > align_error: 0.93607 (0.93607)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.45768 (-0.31136)
| > avg_decoder_loss: 22.18679 (-0.12048)
| > avg_postnet_loss: 11.42594 (-0.07002)
| > avg_stopnet_loss: 0.63592 (-0.00597)
| > avg_loss: 9.03910 (-0.05359)
| > avg_align_error: 0.93607 (+0.00007)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_334.pth
> EPOCH: 167/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 07:45:00)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 22.06155 (22.06155)
| > postnet_loss: 11.35657 (11.35657)
| > stopnet_loss: 0.62971 (0.62971)
| > loss: 8.98425 (8.98425)
| > align_error: 0.93614 (0.93614)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.47007 (+0.01238)
| > avg_decoder_loss: 22.06155 (-0.12524)
| > avg_postnet_loss: 11.35657 (-0.06937)
| > avg_stopnet_loss: 0.62971 (-0.00620)
| > avg_loss: 8.98425 (-0.05485)
| > avg_align_error: 0.93614 (+0.00008)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_336.pth
> EPOCH: 168/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 07:52:57)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 21.93171 (21.93171)
| > postnet_loss: 11.28278 (11.28278)
| > stopnet_loss: 0.62345 (0.62345)
| > loss: 8.92707 (8.92707)
| > align_error: 0.93622 (0.93622)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.45753 (-0.01254)
| > avg_decoder_loss: 21.93171 (-0.12984)
| > avg_postnet_loss: 11.28278 (-0.07380)
| > avg_stopnet_loss: 0.62345 (-0.00627)
| > avg_loss: 8.92707 (-0.05718)
| > avg_align_error: 0.93622 (+0.00008)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_338.pth
> EPOCH: 169/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 08:00:57)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 21.79602 (21.79602)
| > postnet_loss: 11.20800 (11.20800)
| > stopnet_loss: 0.61686 (0.61686)
| > loss: 8.86787 (8.86787)
| > align_error: 0.93631 (0.93631)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.56586 (+0.10833)
| > avg_decoder_loss: 21.79602 (-0.13569)
| > avg_postnet_loss: 11.20800 (-0.07478)
| > avg_stopnet_loss: 0.61686 (-0.00659)
| > avg_loss: 8.86787 (-0.05920)
| > avg_align_error: 0.93631 (+0.00008)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_340.pth
> EPOCH: 170/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 08:08:57)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 21.65388 (21.65388)
| > postnet_loss: 11.13051 (11.13051)
| > stopnet_loss: 0.60959 (0.60959)
| > loss: 8.80569 (8.80569)
| > align_error: 0.93639 (0.93639)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.53192 (-0.03394)
| > avg_decoder_loss: 21.65388 (-0.14214)
| > avg_postnet_loss: 11.13051 (-0.07748)
| > avg_stopnet_loss: 0.60959 (-0.00727)
| > avg_loss: 8.80569 (-0.06218)
| > avg_align_error: 0.93639 (+0.00008)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_342.pth
> EPOCH: 171/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 08:16:48)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 21.50638 (21.50638)
| > postnet_loss: 11.04807 (11.04807)
| > stopnet_loss: 0.60233 (0.60233)
| > loss: 8.74094 (8.74094)
| > align_error: 0.93648 (0.93648)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.67342 (+0.14149)
| > avg_decoder_loss: 21.50638 (-0.14750)
| > avg_postnet_loss: 11.04807 (-0.08244)
| > avg_stopnet_loss: 0.60233 (-0.00727)
| > avg_loss: 8.74094 (-0.06475)
| > avg_align_error: 0.93648 (+0.00009)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_344.pth
> EPOCH: 172/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 08:26:54)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 21.35288 (21.35288)
| > postnet_loss: 10.95676 (10.95676)
| > stopnet_loss: 0.59487 (0.59487)
| > loss: 8.67228 (8.67228)
| > align_error: 0.93656 (0.93656)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.48456 (-0.18886)
| > avg_decoder_loss: 21.35288 (-0.15350)
| > avg_postnet_loss: 10.95676 (-0.09131)
| > avg_stopnet_loss: 0.59487 (-0.00745)
| > avg_loss: 8.67228 (-0.06866)
| > avg_align_error: 0.93656 (+0.00009)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_346.pth
> EPOCH: 173/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 08:37:08)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 21.19357 (21.19357)
| > postnet_loss: 10.86511 (10.86511)
| > stopnet_loss: 0.58717 (0.58717)
| > loss: 8.60184 (8.60184)
| > align_error: 0.93665 (0.93665)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.63213 (+0.14757)
| > avg_decoder_loss: 21.19357 (-0.15931)
| > avg_postnet_loss: 10.86511 (-0.09165)
| > avg_stopnet_loss: 0.58717 (-0.00770)
| > avg_loss: 8.60184 (-0.07044)
| > avg_align_error: 0.93665 (+0.00009)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_348.pth
> EPOCH: 174/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 08:47:05)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 21.02887 (21.02887)
| > postnet_loss: 10.77111 (10.77111)
| > stopnet_loss: 0.57934 (0.57934)
| > loss: 8.52934 (8.52934)
| > align_error: 0.93675 (0.93675)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
warning: audio amplitude out of range, auto clipped.
--> EVAL PERFORMANCE
| > avg_loader_time: 1.70113 (+0.06900)
| > avg_decoder_loss: 21.02887 (-0.16470)
| > avg_postnet_loss: 10.77111 (-0.09400)
| > avg_stopnet_loss: 0.57934 (-0.00783)
| > avg_loss: 8.52934 (-0.07250)
| > avg_align_error: 0.93675 (+0.00009)
> BEST MODEL : output/run-May-12-2023_06+01AM-0429ab9/best_model_350.pth
> EPOCH: 175/1000
--> output/run-May-12-2023_06+01AM-0429ab9
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 99
| > Preprocessing samples
| > Max text length: 145
| > Min text length: 4
| > Avg text length: 46.82828282828283
|
| > Max audio length: 217876.0
| > Min audio length: 9063.0
| > Avg audio length: 47580.818181818184
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> TRAINING (2023-05-12 08:57:08)
--> STEP: 0/2 -- GLOBAL_STEP: 350
| > decoder_loss: 25.03572 (25.03572)
| > postnet_loss: 18.96512 (18.96512)
| > stopnet_loss: 0.60170 (0.60170)
| > loss: 11.60191 (11.60191)
| > align_error: 0.95833 (0.95833)
| > grad_norm: 5.52891 (5.52891)
| > current_lr: 0.00000
| > step_time: 0.75420 (0.75422)
| > loader_time: 81.17010 (81.17014)
> DataLoader initialization
| > Tokenizer:
| > add_blank: False
| > use_eos_bos: False
| > use_phonemes: True
| > phonemizer:
| > phoneme language: en-us
| > phoneme backend: gruut
| > Number of instances : 1
| > Preprocessing samples
| > Max text length: 52
| > Min text length: 52
| > Avg text length: 52.0
|
| > Max audio length: 39712.0
| > Min audio length: 39712.0
| > Avg audio length: 39712.0
| > Num. instances discarded samples: 0
| > Batch group size: 0.
> EVALUATION
--> STEP: 0
| > decoder_loss: 20.85666 (20.85666)
| > postnet_loss: 10.67100 (10.67100)
| > stopnet_loss: 0.57113 (0.57113)
| > loss: 8.45304 (8.45304)
| > align_error: 0.93684 (0.93684)
warning: audio amplitude out of range, auto clipped.
| > Synthesizing test sentences.
> Decoder stopped with `max_decoder_steps` 10000
> Decoder stopped with `max_decoder_steps` 10000