Commit 8700066 · 1 Parent(s): cecbfb5
figure
    	
assets/images/profile_trace_annotated.png ADDED (Git LFS)
    	
dist/assets/images/profile_trace_annotated.png ADDED (Git LFS)
    	
dist/index.html CHANGED
@@ -534,7 +534,7 @@
                         <li>Kernel execution times and memory allocation</li>
                     </ul>
 
-                    <p><img alt="
+                    <p><img alt="profile_trace_annotated.png" src="/assets/images/profile_trace_annotated.png" /></p>
                     <p>Figure: Example trace showing CPU thread launching kernels asynchronously to GPU, with compute kernels and communication happening in parallel across different CUDA streams</p>
 
                     <p>The trace helps identify bottlenecks like:</p>
    	
src/index.html CHANGED
@@ -534,7 +534,7 @@
                         <li>Kernel execution times and memory allocation</li>
                     </ul>
 
-                    <p><img alt="
+                    <p><img alt="profile_trace_annotated.png" src="/assets/images/profile_trace_annotated.png" /></p>
                     <p>Figure: Example trace showing CPU thread launching kernels asynchronously to GPU, with compute kernels and communication happening in parallel across different CUDA streams</p>
 
                     <p>The trace helps identify bottlenecks like:</p>

