[Qualcomm] Support native_layer_norm and affine-free LayerNorm in QNN backend#18990
KevinUW114514 wants to merge 4 commits into pytorch:main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18990
Note: Links to docs will display an error until the docs builds have been completed.
❗ There is 1 currently active SEV. If your PR is affected, please view it below.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @KevinUW114514! Thank you for your pull request and welcome to our community.

Action Required: In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process: In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged. If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
@pytorchbot label "release notes: none"
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
Pull request overview
Fixes a crash in the Qualcomm QNN PT2E quantizer by making _mark_nodes_as_annotated robust to None entries in node lists (e.g., when aten.layer_norm has optional affine args like weight=None).
Changes:
- Skip `None` entries in `_mark_nodes_as_annotated` to avoid `AttributeError` when accessing `node.meta`.
```diff
@@ -29,6 +29,8 @@
 def _mark_nodes_as_annotated(nodes: List[Node]):
     for node in nodes:
+        if node is None:
+            continue
```
Hi @KevinUW114514, thank you for your contribution. I think the root cause is that we need to guard weight and bias creation in the rules files for htp and lpai, similar to #18219; let me know if you are willing to change it. Adding the guard here might silently propagate bad configs like these through the pipeline, and I think we should fail loudly. CC: @shewu-quic
Hi @abhinaykukkadapu, thanks for the follow-up! I actually ran into this root issue as well in my downstream tasks and am currently working on fixing it. I can edit the issue and PR to restate the problem and submit a complete fix. Let me know if there are any concerns. Thank you!

Thanks, that would be awesome; I will look forward to your changes.
Commit: Fixes AttributeError when aten.native_layer_norm has optional weight=None. Both weight and bias are guarded to handle the None case gracefully.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Commit: … backend
- add QNN layer norm support for aten.native_layer_norm.default
- handle missing weight/bias by creating identity weight and zero bias
- always provide bias tensor for QNN LayerNorm op
- add floating-point and quantized tests for native_layer_norm
- print generated pte filename after export
Hi @abhinaykukkadapu and @shewu-quic, thanks for your help earlier! Could you please take a look at the PR to check whether this is the right fix? I appreciate your time 😃
```python
def _mark_nodes_as_annotated(nodes: List[Node]):
    for node in nodes:
        if node is None:
```

We might want to get rid of this, CC: @shewu-quic

I think the node should not be None in this function.
@KevinUW114514 LGTM, will wait for a stamp from @shewu-quic too
Pull request overview
This PR adds Qualcomm QNN backend support for aten.native_layer_norm.default (the decomposed form of torch.nn.LayerNorm) and improves robustness when optional weight/bias inputs are None (e.g., elementwise_affine=False).
Changes:
- Update the QNN op builder to target `aten.native_layer_norm.default` and synthesize identity `weight` / zero `bias` when missing.
- Make quantizer annotation/marking logic resilient to optional `None` nodes and register the HTP annotator for both `layer_norm` and `native_layer_norm`.
- Add new test model + delegate tests intended to cover native layer norm (float + quantized).
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| backends/qualcomm/builders/op_layer_norm.py | Switch visitor target to native_layer_norm and add synthetic weight/bias handling for optional inputs. |
| backends/qualcomm/builders/utils.py | Allow get_parameter() to safely handle None inputs. |
| backends/qualcomm/quantizer/rules.py | Skip None entries when marking nodes as annotated. |
| backends/qualcomm/quantizer/annotators/htp_rules.py | Register LayerNorm annotator for both layer_norm and native_layer_norm; avoid annotating missing optional args. |
| backends/qualcomm/quantizer/annotators/lpai_rules.py | Make LayerNorm annotator tolerant of missing optional args (but still only registered for layer_norm). |
| backends/qualcomm/tests/models.py | Add NativeLayerNorm test module. |
| backends/qualcomm/tests/test_qnn_delegate.py | Add float + quantized tests for NativeLayerNorm. |
| backends/qualcomm/export_utils.py | Print a success message after writing the generated .pte. |
```python
self.weight = torch.nn.Parameter(torch.ones(768))
self.bias = torch.nn.Parameter(torch.zeros(768))
self.normalized_shape = [768]
self.eps = 1e-6

def forward(self, x):
    if self.affine:
        return torch.native_layer_norm(
            x, self.normalized_shape, self.weight, self.bias, self.eps
        )[0]
    else:
        return torch.native_layer_norm(
            x, self.normalized_shape, self.weight, self.bias, self.eps
        )[0]
```
```python
for i, module in enumerate(modules):
    with self.subTest(i=i):
        self.lower_module_and_test_output(module, sample_input)
```
```diff
@@ -29,7 +29,9 @@ def is_parameter(
 def get_parameter(
     node: torch.fx.Node, edge_program: torch.export.ExportedProgram
```

```diff
 @staticmethod
 def annotate(node: Node, quantization_config: QuantizationConfig) -> None:
     act_node = node.args[0]
-    weight_node = node.args[2]
-    bias_node = None
-    if len(node.args) > 2:
-        bias_node = node.args[3]
+    weight_node = node.args[2] if len(node.args) > 2 else None
+    bias_node = node.args[3] if len(node.args) > 3 else None
```
shewu-quic left a comment:

Thank you for your effort.
```python
self.eps = 1e-6

def forward(self, x):
    if self.affine:
```
These two branches seem to be the same. Would it be possible to extend the current LayerNorm with torch.nn.LayerNorm(elementwise_affine=False) as a test case?
```python
bias_node = self.get_node(node.args[3])
if bias_node is not None:
    # Fake node: even when original bias is absent, QNN still needs it
```
I think the bias is optional for QNN and can be kept as in the original design.
https://docs.qualcomm.com/doc/80-63442-10/topic/MasterOpDef.html#layernorm
```diff
 def get_parameter(
     node: torch.fx.Node, edge_program: torch.export.ExportedProgram
-) -> torch.Tensor:
+) -> Optional[torch.Tensor]:
```
This function shouldn't return None. Perhaps we should ensure that the node is not None before this function is called.
```python
def _mark_nodes_as_annotated(nodes: List[Node]):
    for node in nodes:
        if node is None:
```
I think the node should not be None in this function.
[Qualcomm] Support native_layer_norm and affine-free LayerNorm in QNN backend

Summary

Adds QNN backend support for `aten.native_layer_norm.default` (the decomposed form of `torch.nn.LayerNorm`) and handles models where weight/bias are not provided (`elementwise_affine=False`).

Problem

When exporting models with `torch.native_layer_norm` or `torch.nn.LayerNorm(elementwise_affine=False)` to the QNN backend, the following issues occur:

1. Missing `native_layer_norm` visitor: the original `LayerNormVisitor` only targets `aten.layer_norm.default`, but PyTorch decomposes `torch.nn.LayerNorm` to `aten.native_layer_norm.default` during export.
2. `None` weight/bias: when `elementwise_affine=False`, the weight and bias arguments are `None`. The QNN x86_64 runtime cannot handle `None` tensor inputs, causing an `AttributeError` when calling `get_parameter()`.
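As a hedged illustration (not code from the PR itself), a minimal module of the kind that triggers the affine-free path might look like this; the real failing model, a FLUX2 transformer, is far larger:

```python
import torch

# Minimal sketch of the affine-free case. With elementwise_affine=False,
# torch.nn.LayerNorm registers weight and bias as None, so the ATen-level
# native_layer_norm receives weight=None and bias=None after export
# decomposition -- the inputs the original QNN visitor could not handle.
class AffineFreeNorm(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.norm = torch.nn.LayerNorm(8, elementwise_affine=False)

    def forward(self, x):
        return self.norm(x)

m = AffineFreeNorm()
y = m(torch.randn(4, 8))  # normalizes each row to zero mean, unit variance
```

Exporting such a module to the QNN backend is what surfaced the two issues above.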
Solution

1. Update visitor target (`op_layer_norm.py`)

Change the visitor target from `aten.layer_norm.default` to `aten.native_layer_norm.default`. This is correct because during ExecuTorch export, `aten.layer_norm.default` is decomposed to `aten.native_layer_norm.default` before the QNN lowering stage.
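A sketch of the kind of change involved (the class and attribute names here are assumptions modeled on the discussion, not the exact ExecuTorch source):

```python
# Hypothetical sketch: the op builder advertises which ATen op it lowers.
# Before this PR it matched aten.layer_norm.default, which never reaches
# QNN lowering because export decomposes it first.
class LayerNormVisitor:
    target = ["aten.native_layer_norm.default"]  # was: ["aten.layer_norm.default"]
```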
2. Handle `None` weight/bias (`op_layer_norm.py`)

When weight/bias are `None`, create synthetic tensors: `torch.ones(normalized_shapes)` (identity transform) and `torch.zeros(normalized_shapes)` (no offset). Create synthetic `fx.Node` objects to register these as QNN static tensors.
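The synthesis step can be sketched like this (the helper name is illustrative; the real builder additionally wraps the tensors in `fx.Node` objects for QNN static-tensor registration):

```python
import torch

def fill_affine_defaults(normalized_shapes, weight, bias):
    """Sketch: substitute identity weight / zero bias when affine params are absent."""
    if weight is None:
        weight = torch.ones(normalized_shapes)   # scale of 1 == identity transform
    if bias is None:
        bias = torch.zeros(normalized_shapes)    # offset of 0 == no shift
    return weight, bias
```

Numerically, the synthesized tensors make the affine-free case behave exactly like `elementwise_affine=True` with default initialization.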
3. Use the same annotator for both ops (`htp_rules.py`)

The quantizer annotator registers both `aten.layer_norm.default` and `aten.native_layer_norm.default` to the same `LayerNorm` class, since both ops have identical argument schemas.
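The dual registration can be sketched as follows (the decorator and registry are stand-in assumptions for the quantizer's actual registration mechanism):

```python
import torch

OP_ANNOTATORS = {}  # hypothetical registry mapping ATen ops to annotator classes

def register_annotator(ops):
    def wrap(cls):
        for op in ops:
            OP_ANNOTATORS[op] = cls
        return cls
    return wrap

@register_annotator(
    [torch.ops.aten.layer_norm.default, torch.ops.aten.native_layer_norm.default]
)
class LayerNorm:
    """Single annotator shared by both ops (identical argument schemas)."""
```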
4. Add a `None` check to `get_parameter()` (`utils.py`)

Guard against `None` nodes to prevent an `AttributeError`.
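The guard can be sketched like this (the body after the guard is a placeholder; the real function resolves the node's tensor from the `ExportedProgram`):

```python
from typing import Optional

import torch

def get_parameter(node, edge_program) -> Optional[torch.Tensor]:
    # Optional inputs (weight/bias under elementwise_affine=False) arrive as
    # None; return None instead of raising AttributeError on node attributes.
    if node is None:
        return None
    # ... original parameter lookup against edge_program goes here ...
    return edge_program.state_dict[node.target]
```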
Files Changed

- `builders/op_layer_norm.py`: `native_layer_norm` support + handle `None` weight/bias
- `builders/utils.py`: `None` guard in `get_parameter()`
- `quantizer/annotators/htp_rules.py`: register annotator for both ops
- `tests/models.py`: `NativeLayerNorm` test model
- `tests/test_qnn_delegate.py`: float + quantized tests
Test Plan

Run QNN delegate tests for layer_norm:

```shell
python backends/qualcomm/tests/test_qnn_delegate.py \
  -k "test_qnn_backend_layer_norm or test_qnn_backend_native_layer_norm" \
  --soc_model SM8650 \
  --build_folder build-x86/ \
  --executorch_root . \
  --enable_x86_64
```

Expected: 4 tests pass (2 floating-point, 2 quantized).
Release Notes

Release notes: qualcomm

Related Issues

This resolves the issue where FLUX2 transformer export fails with:

```
[QNN Delegate Op Builder]: LayerNorm weight is None, skipping
AttributeError: 'NoneType' object has no attribute 'name'
```

Fixes #18989
@abhinaykukkadapu