
Merge upstream, releases/gcc-13 branch point #2130

Merged
tschwinge merged 95 commits into master from tschwinge/merge-upstream
Apr 18, 2023

Conversation

@tschwinge

Member

This must of course not be rebased by GitHub merge queue, but has to become a proper Git merge. (I'll handle that, once ready.)

andrewwmacleod and others added 30 commits April 6, 2023 08:32
When the IL is rewritten after a statement has been processed and
dependencies cached, it's possible that an ssa-name in the dependency
cache is no longer in the IL.  Check this before trying to recompute.

	PR tree-optimization/109417
	gcc/
	* gimple-range-gori.cc (gori_compute::may_recompute_p): Check if
	dependency is in SSA_NAME_FREE_LIST.

	gcc/testsuite/
	* gcc.dg/pr109417.c: New.
My change r13-416-g485a0ae0982abe caused the compiler to stop
generating auto-inc operations on mve loads and stores.  The fix
is to check whether there is a replacement register available
when in strict mode and the register is still a pseudo.

gcc:

	PR target/107674
	* config/arm/arm.cc (arm_effective_regno): New function.
	(mve_vector_mem_operand): Use it.
This is just a minor issue I found with a previous test
of mine that caused it to fail in C++ mode due to these
unused const variables being uninitialised. I forgot to
remove these after removing some test cases that did use
them. I removed the test cases, because I came to the
conclusion that the const-ness of the immediate was
irrelevant to the test itself.
Removing the variables now makes the test PASS for C++.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-fp.c: Remove unused variables.
	* gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c: Remove unused variables.
Skip ppc-fortran.exp if a trivial fortran program cannot be compiled.


for  gcc/testsuite/ChangeLog

	* gcc.target/powerpc/ppc-fortran/ppc-fortran.exp: Test for
	fortran compiler, skip if missing.
Backport CL 421442 from upstream.

Original description:

Arrange for tests that call setMimeInit to fully restore the old values,
by clearing the sync.Once that controls initialization.

Once we've done that, call initMime in initMimeUnixTest because
otherwise the test types loaded there will be cleared by the call to
initMime that previously was not being done.

For golang/go#51648

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/483117
2023-04-08   Paul Thomas  <pault@gcc.gnu.org>

	* gfortran.dg/c-interop/allocatable-optional-pointer.f90 : Fix
	dg directive and remove trailing whitespace.
	* gfortran.dg/c-interop/c407a-1.f90 : ditto
	* gfortran.dg/c-interop/c407b-1.f90 : ditto
	* gfortran.dg/c-interop/c407b-2.f90 : ditto
	* gfortran.dg/c-interop/c407c-1.f90 : ditto
	* gfortran.dg/c-interop/c535a-1.f90 : ditto
	* gfortran.dg/c-interop/c535a-2.f90 : ditto
	* gfortran.dg/c-interop/c535b-1.f90 : ditto
	* gfortran.dg/c-interop/c535b-2.f90 : ditto
	* gfortran.dg/c-interop/c535b-3.f90 : ditto
	* gfortran.dg/c-interop/c535c-1.f90 : ditto
	* gfortran.dg/c-interop/c535c-2.f90 : ditto
	* gfortran.dg/c-interop/deferred-character-1.f90 : ditto
	* gfortran.dg/c-interop/removed-restrictions-1.f90 : ditto
	* gfortran.dg/c-interop/removed-restrictions-2.f90 : ditto
	* gfortran.dg/c-interop/removed-restrictions-4.f90 : ditto
	* gfortran.dg/c-interop/tkr.f90 : ditto
	* gfortran.dg/class_result_10.f90 : ditto
	* gfortran.dg/dtio_35.f90 : ditto
	* gfortran.dg/gomp/affinity-clause-1.f90 : ditto
	* gfortran.dg/pr103258.f90 : ditto
	* gfortran.dg/pr59107.f90 : ditto
	* gfortran.dg/pr93835.f08 : ditto
2023-04-08  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
	PR fortran/87477
	* iresolve.cc (gfc_resolve_adjustl, gfc_resolve_adjustr): if
	string length is deferred use the string typespec for result.
	* resolve.cc (resolve_assoc_var): Handle parentheses around the
	target expression.
	(resolve_block_construct): Remove unnecessary static decls.
	* trans-array.cc (gfc_conv_expr_descriptor): Guard string len
	expression in condition. Improve handling of string length and
	span, especially for substrings of the descriptor.
	(duplicate_allocatable): Make element type more explicit with
	'eltype'.
	* trans-decl.cc (gfc_get_symbol_decl): Emit a fatal error with
	appropriate message instead of ICE if symbol type is unknown.
	(gfc_generate_function_code): Set current locus to proc_sym
	declared_at.
	* trans-expr.cc (gfc_get_expr_charlen): Retain last charlen in
	'previous' and use if end expression in substring reference is
	null.
	(gfc_conv_string_length): Use gfc_conv_expr_descriptor if
	'expr_flat' is an array. Add post block to catch deallocation
	of temporaries.
	(gfc_conv_procedure_call): Assign the parmse string length to
	the expression string length, if it is deferred.
	(gfc_trans_alloc_subarray_assign): If this is a deferred string
	length component, store the string length in the hidden comp.
	Update the typespec length accordingly. Generate a new type
	spec for the call to gfc_duplicate_allocatable in this case.
	* trans-io.cc (gfc_trans_transfer): Scalarize transfer of
	deferred character array components.

gcc/testsuite/
	PR fortran/87477
	* gfortran.dg/associate_47.f90 : Enable substring test.
	* gfortran.dg/associate_51.f90 : Update an error message.
	* gfortran.dg/goacc/array-with-dt-2.f90 : Add span to
	uninitialized dg-warnings.

	PR fortran/85686
	PR fortran/88247
	PR fortran/91941
	PR fortran/92779
	PR fortran/93339
	PR fortran/93813
	PR fortran/100948
	PR fortran/102106
	* gfortran.dg/associate_60.f90 : New test

	PR fortran/98408
	* gfortran.dg/pr98408.f90 : New test

	PR fortran/105205
	* gfortran.dg/pr105205.f90 : New test

	PR fortran/106918
	* gfortran.dg/pr106918.f90 : New test
I've noticed
make: Circular build/genrvv-type-indexer.o <- gtype-desc.h dependency dropped.

The following patch fixes that.  The RTL_BASE_H variable includes a lot of
headers which the generator doesn't include, including gtype-desc.h.
I've preprocessed it and checked all gcc/libiberty headers against what is
included in the other dependency variables and here is what I found:
1) coretypes.h includes align.h, poly-int.h and poly-int-types.h which
   weren't listed (most dependencies are thankfully handled automatically,
   so it isn't that big a deal except for these generators and the like)
2) system.h includes filenames.h (already listed) but filenames.h includes
   hashtab.h; instead of adding FILENAMES_H I've just added the dependency
   to SYSTEM_H
3) $(RTL_BASE_H) wasn't really needed at all and insn-modes.h is already
   included in $(CORETYPES_H)

2023-04-08  Jakub Jelinek  <jakub@redhat.com>

	* Makefile.in (CORETYPES_H): Depend on align.h, poly-int.h and
	poly-int-types.h.
	(SYSTEM_H): Depend on $(HASHTAB_H).
	* config/riscv/t-riscv (build/genrvv-type-indexer.o): Remove unused
	dependency on $(RTL_BASE_H), remove redundant dependency on
	insn-modes.h.
…reversed direction [PR109402]

muldi3 will deallocate stack space after the call to __save_r26_r31,
then re-allocate the space a short while later.  If an interrupt
occurs in that window, it can clobber items on the stack.

	PR target/109402

libgcc/
	* config/v850/lib1funcs.S (___muldi3): Remove unnecessary
	stack manipulations.
2023-04-08  John David Anglin  <danglin@gcc.gnu.org>

gcc/testsuite/ChangeLog:

	* gcc.dg/long_branch.c: Use timeout factor 2.0 on hppa*-*-*.
2023-04-08  John David Anglin  <danglin@gcc.gnu.org>

gcc/testsuite/ChangeLog:

	* gcc.dg/pr84877.c: xfail on hppa*-*-*.
If we have an object with SSA_NAME_OCCURS_IN_ABNORMAL_PHI, then
maybe_push_res_to_seq may fail.  Directly build the extraction
for that case.

	PR tree-optimization/109392

gcc/
	* tree-vect-generic.cc (tree_vec_extract): Handle failure
	of maybe_push_res_to_seq better.

gcc/testsuite/

	* gcc.dg/pr109392.c: New test.
…ind.

When the function contains no local vars and also no nested scopes, there
is no top-level bind expression.  Because the rewritten coroutine body will
require both local vars and contain nested scopes, we add a bind expression
to such functions.  When this was done the necessary scope blocks were
omitted which leads to disconnected function content.

Fixed by adding a new block to the added bind expression.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/cp/ChangeLog:

	* coroutines.cc (coro_rewrite_function_body): Ensure that added
	bind expressions have scope blocks.
gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_available_features):
	Detect AMX-COMPLEX.
	* common/config/i386/i386-common.cc
	(OPTION_MASK_ISA2_AMX_COMPLEX_SET,
	OPTION_MASK_ISA2_AMX_COMPLEX_UNSET): New.
	(ix86_handle_option): Handle -mamx-complex.
	* common/config/i386/i386-cpuinfo.h (enum processor_features):
	Add FEATURE_AMX_COMPLEX.
	* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
	amx-complex.
	* config.gcc: Add amxcomplexintrin.h.
	* config/i386/cpuid.h (bit_AMX_COMPLEX): New.
	* config/i386/i386-c.cc (ix86_target_macros_internal): Define
	__AMX_COMPLEX__.
	* config/i386/i386-isa.def (AMX_COMPLEX): Add DEF_PTA(AMX_COMPLEX).
	* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
	Handle amx-complex.
	* config/i386/i386.opt: Add option -mamx-complex.
	* config/i386/immintrin.h: Include amxcomplexintrin.h.
	* doc/extend.texi: Document amx-complex.
	* doc/invoke.texi: Document -mamx-complex.
	* doc/sourcebuild.texi: Document target amx-complex.
	* config/i386/amxcomplexintrin.h: New file.

gcc/testsuite/ChangeLog:

	* g++.dg/other/i386-2.C: Add -mamx-complex.
	* g++.dg/other/i386-3.C: Ditto.
	* gcc.target/i386/amx-check.h: Add cpu check for AMX-COMPLEX.
	* gcc.target/i386/amx-helper.h: Add amx-complex support.
	* gcc.target/i386/funcspec-56.inc: Add new target attribute.
	* gcc.target/i386/sse-12.c: Add -mamx-complex.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-14.c: Ditto.
	* gcc.target/i386/sse-22.c: Add amx-complex.
	* gcc.target/i386/sse-23.c: Ditto.
	* lib/target-supports.exp (check_effective_target_amx_complex): New.
	* gcc.target/i386/amxcomplex-asmatt-1.c: New test.
	* gcc.target/i386/amxcomplex-asmintel-1.c: Ditto.
	* gcc.target/i386/amxcomplex-cmmimfp16ps-2.c: Ditto.
	* gcc.target/i386/amxcomplex-cmmrlfp16ps-2.c: Ditto.
gcc/ChangeLog:

	* config/i386/i386.h (PTA_GRANITERAPIDS): Add PTA_AMX_COMPLEX.
This is version 3 of the patch.  This is essentially version 1 with the removal
of changes to altivec.md, and cleanup of the comments.

Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was used,
and those changes are deleted in this patch.

The Altivec instructions vmaddfp and vnmsubfp have different rounding behaviors
than the VSX xvmaddsp and xvnmsubsp instructions due to VSCR[NJ] and other
corner cases.  In particular, generating these instructions seems to break
Eigen on big endian systems.

2023-04-09   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	PR target/70243
	* config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.
	(vsx_nfmsv4sf4): Do not generate vnmsubfp.

gcc/testsuite/

	PR target/70243
	* gcc.target/powerpc/pr70243.c: New test.
gcc/
	PR target/108812
	* config/rs6000/vsx.md (vsx_sign_extend_qi_<mode>): Rename to...
	(vsx_sign_extend_v16qi_<mode>): ... this.
	(vsx_sign_extend_hi_<mode>): Rename to...
	(vsx_sign_extend_v8hi_<mode>): ... this.
	(vsx_sign_extend_si_v2di): Rename to...
	(vsx_sign_extend_v4si_v2di): ... this.
	(vsignextend_qi_<mode>): Remove.
	(vsignextend_hi_<mode>): Remove.
	(vsignextend_si_v2di): Remove.
	(vsignextend_v2di_v1ti): Remove.
	(*xxspltib_<mode>_split): Replace gen_vsx_sign_extend_qi_v2di with
	gen_vsx_sign_extend_v16qi_v2di and gen_vsx_sign_extend_qi_v4si
	with gen_vsx_sign_extend_v16qi_v4si.
	* config/rs6000/rs6000.md (split for DI constant generation):
	Replace gen_vsx_sign_extend_qi_si with gen_vsx_sign_extend_v16qi_si.
	(split for HSDI constant generation): Replace gen_vsx_sign_extend_qi_di
	with gen_vsx_sign_extend_v16qi_di and gen_vsx_sign_extend_qi_si
	with gen_vsx_sign_extend_v16qi_si.
	* config/rs6000/rs6000-builtins.def (__builtin_altivec_vsignextsb2d):
	Set bif-pattern to vsx_sign_extend_v16qi_v2di.
	(__builtin_altivec_vsignextsb2w): Set bif-pattern to
	vsx_sign_extend_v16qi_v4si.
	(__builtin_altivec_vsignextsh2d): Set bif-pattern to
	vsx_sign_extend_v8hi_v2di.
	(__builtin_altivec_vsignextsh2w): Set bif-pattern to
	vsx_sign_extend_v8hi_v4si.
	(__builtin_altivec_vsignextsw2d): Set bif-pattern to
	vsx_sign_extend_si_v2di.
	(__builtin_altivec_vsignext): Set bif-pattern to
	vsx_sign_extend_v2di_v1ti.
	* config/rs6000/rs6000-builtin.cc (lxvrse_expand_builtin): Replace
	gen_vsx_sign_extend_qi_v2di with gen_vsx_sign_extend_v16qi_v2di,
	gen_vsx_sign_extend_hi_v2di with gen_vsx_sign_extend_v8hi_v2di and
	gen_vsx_sign_extend_si_v2di with gen_vsx_sign_extend_v4si_v2di.

gcc/testsuite/
	PR target/108812
	* gcc.target/powerpc/p9-sign_extend-runnable.c: Set corresponding
	expected vectors for Big Endian.
	* gcc.target/powerpc/int_128bit-runnable.c: Likewise.
The original patch to fix this PR broke the if-conversion of calls into
IFN_MASK_CALL.  This patch restores that original behaviour and makes sure the
tests added earlier specifically test inbranch SIMD clones.

gcc/ChangeLog:

	PR tree-optimization/108888
	* tree-if-conv.cc (predicate_statements): Fix gimple call check.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-simd-clone-16.c: Make simd clone inbranch only.
	* gcc.dg/vect/vect-simd-clone-17.c: Likewise.
	* gcc.dg/vect/vect-simd-clone-18.c: Likewise.
The revision r13-259-g76db543db88727 moved a condition from one
file to another, but now we do not drop x_flag_var_tracking_assignments
as it was done before the mentioned revision.

	PR driver/108241

gcc/ChangeLog:

	* opts.cc (finish_options): Drop also
	x_flag_var_tracking_assignments.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr108241.c: New test.
	* gcc.dg/pr79570.c: Add also -g option.
Commit r13-7120-g46fe32cb4d887d44a62f9c4ff2a72532d4eb5a19 added the
missing hyphen to 'dg-final', which exposed an -m32 pattern mismatch.

gcc/testsuite/

	* gfortran.dg/gomp/affinity-clause-1.f90: Update scan-tree pattern
	for -m32.
This patch registers a RISC-V specific function for
TARGET_ZERO_CALL_USED_REGS instead of the default in targhooks.cc.  It
clears the GPRs and the vector-related registers.
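
From the user's side, this hook is reached through GCC's -fzero-call-used-regs option or the zero_call_used_regs function attribute. A minimal sketch of the attribute form follows; the function mix_key and the ZERO_USED_REGS macro are hypothetical names chosen for illustration, and the guard simply drops the attribute on compilers that do not provide it.

```c
/* Sketch only: ask the compiler to zero the call-used registers that
   this function touched before it returns.  */
#if defined(__has_attribute)
# if __has_attribute(zero_call_used_regs)
#  define ZERO_USED_REGS __attribute__((zero_call_used_regs("used")))
# endif
#endif
#ifndef ZERO_USED_REGS
# define ZERO_USED_REGS /* attribute unavailable: plain function */
#endif

/* Hypothetical helper; the name and body are illustrative only.  */
ZERO_USED_REGS int
mix_key (int key, int salt)
{
  return (key ^ salt) + 1;
}
```

The attribute does not change the returned value; it only affects which registers are cleared in the epilogue.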

gcc/ChangeLog:

	PR target/109104
	* config/riscv/riscv-protos.h (emit_hard_vlmax_vsetvl): New.
	* config/riscv/riscv-v.cc (emit_hard_vlmax_vsetvl): New.
	(emit_vlmax_vsetvl): Use emit_hard_vlmax_vsetvl.
	* config/riscv/riscv.cc (vector_zero_call_used_regs): New.
	(riscv_zero_call_used_regs): New.
	(TARGET_ZERO_CALL_USED_REGS): New.

gcc/testsuite/ChangeLog:

	PR target/109104
	* gcc.target/riscv/zero-scratch-regs-1.c: New test.
	* gcc.target/riscv/zero-scratch-regs-2.c: New test.
	* gcc.target/riscv/zero-scratch-regs-3.c: New test.

Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
Co-authored-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
…pattern

There is no need to split an xori/ori with a small constant.  Take the test
case `int foo(int idx) { return idx|3; }` as an example,

rv64im_zba generates:
        ori     a0,a0,3
        ret
but, rv64im_zba_zbs generates:
        ori     a0,a0,1
        ori     a0,a0,2
        ret

With this change, insn `ori r2,r1,3` will not be split in zbs.
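
The two sequences compute the same value, so the split only costs an instruction. A small sketch of that equivalence in C follows; foo is the test case quoted above, while foo_split is a hypothetical function that mirrors the two-instruction zbs output.

```c
/* foo is the test case quoted in the commit message.  */
int
foo (int idx)
{
  return idx | 3;
}

/* Hypothetical mirror of the split sequence emitted with zbs:
   ori a0,a0,1 followed by ori a0,a0,2.  */
int
foo_split (int idx)
{
  idx |= 1;
  idx |= 2;
  return idx;
}
```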

gcc/
	* config/riscv/predicates.md (uimm_extra_bit_or_twobits): Adjust
	predicate to avoid splitting arith constants.

gcc/testsuite

	* gcc.target/riscv/zbs-extra-bit-or-twobits.c: New test.
	* sv.po: Update.
The test case gcc.target/powerpc/pr83677.c was written for
LE environment, this patch is to make it work on BE as well.

	PR testsuite/108815

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr83677.c (v_expand_u8, v_expand_u16,
	v_load_deinterleave_f32, v_store_interleave_f32): Adjust some code by
	considering BE.
rguenth and others added 29 commits April 14, 2023 11:43
This replaces i686*-*-* && { ! lp64 } with the appropriate
{ i?86-*-* x86_64-*-* } && { ! lp64 } for the testcases and
also amends the e variants checking last variant for avx.
I've used avx in the dump scanning, not avx_runtime, since
the dumps get produced when one would not execute but only
compile them.  The f variants lack AVX checking, I didn't
rectify this with this patch.

	* gcc.dg/vect/vect-simd-clone-16e.c: Fix x86 lp64 checking
	and add missing avx guard.
	* gcc.dg/vect/vect-simd-clone-17e.c: Likewise.
	* gcc.dg/vect/vect-simd-clone-18e.c: Likewise.
	* gcc.dg/vect/vect-simd-clone-16f.c: Fix x86 lp64 checking.
	* gcc.dg/vect/vect-simd-clone-17f.c: Likewise.
	* gcc.dg/vect/vect-simd-clone-18f.c: Likewise.
The following fixes a check that should have rejected vectorizing
a conversion between a mask and non-mask type.  Those should be
done via pattern statements.

	PR tree-optimization/109502
	* tree-vect-stmts.cc (vectorizable_assignment): Fix
	check for conversion between mask and non-mask types.

	* gcc.dg/vect/pr109502.c: New testcase.
2023-04-14  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
	PR fortran/104272
	* gfortran.h : Add expr3_not_explicit bit field to gfc_code.
	* resolve.cc (resolve_allocate_expr): Set bit field when the
	default initializer is applied to expr3.
	* trans-stmt.cc (gfc_trans_allocate): If expr3_not_explicit is
	set, do not deallocate expr3.

gcc/testsuite/
	PR fortran/104272
	* gfortran.dg/class_result_8.f90 : Number of builtin_frees down
	from 6 to 5 without memory leaks.
	* gfortran.dg/finalize_52.f90: New test
Add a static_assert and a comment so that calling std::format for
unformattable argument types will now show:

/home/jwakely/gcc/13/include/c++/13.0.1/format:3563:22: error: static assertion failed: std::formatter must be specialized for each format arg
 3563 |       static_assert((is_default_constructible_v<formatter<_Args, _CharT>> && ...),
      |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

and:

  140 |       formatter() = delete; // No std::formatter specialization for this type.

libstdc++-v3/ChangeLog:

	* include/std/format (formatter): Add comment to deleted default
	constructor of primary template.
	(_Checking_scanner): Add static_assert.
The following reverts the s/avx_runtime/avx/ changes I've done,
they were wrong.

	* gcc.dg/vect/vect-simd-clone-16e.c: Revert back to
	checking avx_runtime in dump scanning.
	* gcc.dg/vect/vect-simd-clone-17e.c: Likewise.
	* gcc.dg/vect/vect-simd-clone-18e.c: Likewise.
libstdc++-v3/ChangeLog:

	* include/bits/ranges_algo.h: Include <optional> for C++23.
	(__cpp_lib_fold): Define for C++23.
	(in_value_result): Likewise.
	(__detail::__flipped): Likewise.
	(__detail::__indirectly_binary_left_foldable_impl): Likewise.
	(__detail::__indirectly_binary_left_foldable): Likewise.
	(__detail::__indirectly_binary_right_foldable): Likewise.
	(fold_left_with_iter_result): Likewise.
	(__fold_left_with_iter_fn, fold_left_with_iter): Likewise.
	(__fold_left_fn, fold_left): Likewise.
	(__fold_left_first_with_iter_fn, fold_left_first_with_iter):
	Likewise.
	(__fold_left_first_fn, fold_left_first): Likewise.
	(__fold_right_fn, fold_right): Likewise.
	(__fold_right_last_fn, fold_right_last): Likewise.
	* include/std/version (__cpp_lib_fold): Likewise.
	* testsuite/25_algorithms/fold_left/1.cc: New test.
	* testsuite/25_algorithms/fold_right/1.cc: New test.
This moves down the definitions of the range const-access CPOs to after
the definition of input_range in preparation for implementing P2278R4
which redefines these CPOs in a way that indirectly uses input_range.

libstdc++-v3/ChangeLog:

	* include/bits/ranges_base.h (__cust_access::__as_const)
	(__cust_access::_CBegin, __cust::cbegin)
	(__cust_access::_CEnd, __cust::cend)
	(__cust_access::_CRBegin, __cust::crbegin)
	(__cust_access::_CREnd, __cust::crend)
	(__cust_access::_CData, __cust::cdata): Move down definitions to
	shortly after the definition of input_range.
…iterator"

This also implements the approved follow-up LWG issues 3765, 3766, 3769,
3770, 3811, 3850, 3853, 3862 and 3872.

libstdc++-v3/ChangeLog:

	* include/bits/ranges_base.h (const_iterator_t): Define for C++23.
	(const_sentinel_t): Likewise.
	(range_const_reference_t): Likewise.
	(constant_range): Likewise.
	(__cust_access::__possibly_const_range): Likewise, replacing ...
	(__cust_access::__as_const): ... this.
	(__cust_access::_CBegin::operator()): Redefine for C++23 as per P2278R4.
	(__cust_access::_CEnd::operator()): Likewise.
	(__cust_access::_CRBegin::operator()): Likewise.
	(__cust_access::_CREnd::operator()): Likewise.
	(__cust_access::_CData::operator()): Likewise.
	* include/bits/ranges_util.h (ranges::__detail::__different_from):
	Make it an alias of std::__detail::__different_from.
	(view_interface::cbegin): Define for C++23.
	(view_interface::cend): Likewise.
	* include/bits/stl_iterator.h (__detail::__different_from): Define.
	(iter_const_reference_t): Define for C++23.
	(__detail::__constant_iterator): Likewise.
	(__detail::__is_const_iterator): Likewise.
	(__detail::__not_a_const_iterator): Likewise.
	(__detail::__iter_const_rvalue_reference_t): Likewise.
	(__detail::__basic_const_iter_cat): Likewise.
	(const_iterator): Likewise.
	(__detail::__const_sentinel): Likewise.
	(const_sentinel): Likewise.
	(basic_const_iterator): Likewise.
	(common_type<basic_const_iterator<_Tp>, _Up>): Likewise.
	(common_type<_Up, basic_const_iterator<_Tp>>): Likewise.
	(common_type<basic_const_iterator<_Tp>, basic_const_iterator<_Up>>):
	Likewise.
	(make_const_iterator): Define for C++23.
	(make_const_sentinel): Likewise.
	* include/std/ranges (__cpp_lib_ranges_as_const): Likewise.
	(as_const_view): Likewise.
	(enable_borrowed_range<as_const_view>): Likewise.
	(views::__detail::__is_ref_view): Likewise.
	(views::__detail::__can_as_const_view): Likewise.
	(views::_AsConst, views::as_const): Likewise.
	* include/std/span (span::const_iterator): Likewise.
	(span::const_reverse_iterator): Likewise.
	(span::cbegin): Likewise.
	(span::cend): Likewise.
	(span::crbegin): Likewise.
	(span::crend): Likewise.
	* include/std/version (__cpp_lib_ranges_as_const): Likewise.
	* testsuite/std/ranges/adaptors/join.cc (test06): Adjust to
	behave independently of C++20 vs C++23.
	* testsuite/std/ranges/version_c++23.cc: Verify value of
	__cpp_lib_ranges_as_const macro.
	* testsuite/24_iterators/const_iterator/1.cc: New test.
	* testsuite/std/ranges/adaptors/as_const/1.cc: New test.
The AArch64 back-end now asserts that the main variant of scalar types
has TYPE_USER_ALIGN cleared, and that's not the case for scalar types
declared with a confirming alignment clause in Ada.

gcc/ada/
	PR bootstrap/109510
	* gcc-interface/decl.cc (gnat_to_gnu_entity) <types>: Reset align
	to zero if its value is equal to TYPE_ALIGN and the type is scalar.
	Set TYPE_USER_ALIGN on the type only if align is positive.
gcc/fortran/ChangeLog:

	PR fortran/109511
	* simplify.cc (gfc_simplify_set_exponent): Fix implementation of
	compile-time simplification of intrinsic SET_EXPONENT for argument
	X < 1 and for I < 0.
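
For reference, SET_EXPONENT (X, I) yields FRACTION (X) * RADIX (X)**I. The sketch below models that definition in C for a binary radix, using frexp and ldexp; set_exponent_ref is a hypothetical name, and this is a reference model for checking values such as X < 1 or I < 0, not the code in gfc_simplify_set_exponent.

```c
#include <math.h>

/* Reference model, assuming a binary floating-point radix:
   SET_EXPONENT (X, I) = FRACTION (X) * 2**I.
   frexp returns the fraction with magnitude in [0.5, 1) (or zero),
   which matches what FRACTION returns for radix 2.  */
double
set_exponent_ref (double x, int i)
{
  int unused_exp;
  double fraction = frexp (x, &unused_exp);
  return ldexp (fraction, i);
}
```

For example, X = 0.25 has fraction 0.5, so an exponent of 3 gives 4.0 and an exponent of -2 gives 0.125.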

gcc/testsuite/ChangeLog:

	PR fortran/109511
	* gfortran.dg/set_exponent_1.f90: New test.
Here we hit the MEM_REF case, with its arg an ADDR_EXPR, but had no handling
for that and wrongly assumed it would be a reference to a local variable.
This patch overhauls the logic for deciding whether the target is something
to warn about so that we only warn if we specifically recognize the target
as non-local.  None of the existing tests regress as a result.

	PR c++/109514

gcc/ChangeLog:

	* gimple-ssa-warn-access.cc (pass_waccess::check_dangling_stores):
	Overhaul lhs_ref.ref analysis.

gcc/testsuite/ChangeLog:

	* g++.dg/warn/Wdangling-pointer-6.C: New test.
When long double is 64-bit wide, as on vxworks, the rs6000 backend
defines neither the __ibm128 type nor the __SIZEOF_IBM128__ macro, but
pr99708.c expected both to be always defined.  Adjust the test to
match the implementation.
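
The adjusted expectation can be pictured with a guard on the macro. This is a sketch under that assumption, not the contents of pr99708.c; ibm128_size is a hypothetical helper.

```c
/* Returns the size of __ibm128 when the target provides it, or 0 when
   neither the type nor the macro exists (for example when long double
   is 64 bits wide, or on a non-PowerPC target).  */
unsigned long
ibm128_size (void)
{
#ifdef __SIZEOF_IBM128__
  return (unsigned long) sizeof (__ibm128);
#else
  return 0;
#endif
}
```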


Co-Authored-By: Kewen Lin <linkw@linux.ibm.com>

for  gcc/testsuite/ChangeLog

	PR target/99708
	* gcc.target/powerpc/pr99708.c: Accept lack of
	__SIZEOF_IBM128__ when long double is 64-bit wide.
The following patch is just a dumb improvement: it gets rid of 2 unnecessary
instructions on both the PR's original testcase and the two reduced ones,
with both -mcpu=neoverse-v1 and -mavx512f.

The thing is, if we have args_len (args_len >= 2) unique PHI arguments,
we need only args_len - 1 COND_EXPRs to expand the PHI: the first
COND_EXPR can merge 2 unique arguments, and each following one merges
another unique argument with the previously merged arguments.
The code, for mysterious reasons, was always emitting args_len
COND_EXPRs.  The first COND_EXPR merged the first and second unique
arguments, the second COND_EXPR merged the second unique argument with the
result of merging the first and second unique arguments, and the rest was
as expected: the nth COND_EXPR for n > 2 merged the nth unique argument
with the result of merging the previous unique arguments.
Now, in my understanding, the bb_predicates for bb's predecessors need to
form a disjoint set which together creates the successor's bb_predicate,
so I don't see why we'd need to check all the bb_predicates: if we check
all but one, then when all those other ones are false the last bb_predicate
is necessarily true.  Given that the code attempts to sort the argument with
the most occurrences (so likely the most complex combined predicate) last,
I chose not to test that last argument's predicate.
So e.g. on the testcase from comment 47 in the PR:
void
foo (int *f, int d, int e)
{
  for (int i = 0; i < 1024; i++)
    {
      int a = f[i];
      int t;
      if (a < 0)
        t = 1;
      else if (a < e)
        t = 1 - a * d;
      else
        t = 0;
      f[i] = t;
    }
}
we used to emit:
  _7 = a_10 < 0;
  _21 = a_10 >= 0;
  _22 = a_10 < e_11(D);
  _23 = _21 & _22;
  _26 = a_10 >= e_11(D);
  _27 = _21 & _26;
  _ifc__42 = _7 ? 1 : t_13;
  _ifc__43 = _23 ? t_13 : _ifc__42;
  t_6 = _27 ? 0 : _ifc__43;
while the following patch changes it to:
  _7 = a_10 < 0;
  _21 = a_10 >= 0;
  _22 = a_10 < e_11(D);
  _23 = _21 & _22;
  _ifc__42 = _23 ? t_13 : 0;
  t_6 = _7 ? 1 : _ifc__42;
which I believe should be sufficient for a PHI <1, t_13, 0>.
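
The two GIMPLE sequences above can be checked against the original branches with a plain C model. This is a sketch only: the function names are hypothetical, the local variable names loosely follow the SSA names in the dumps, and small inputs are assumed so that 1 - a * d does not overflow.

```c
#include <stdbool.h>

/* The loop body from the testcase, written with branches.  */
int
t_branchy (int a, int d, int e)
{
  if (a < 0)
    return 1;
  else if (a < e)
    return 1 - a * d;
  else
    return 0;
}

/* Mirror of the GIMPLE emitted before the patch: three selects.  */
int
t_three_selects (int a, int d, int e)
{
  int t_13 = 1 - a * d;   /* computed unconditionally after if-conversion */
  bool p7 = a < 0;
  bool p21 = a >= 0;
  bool p22 = a < e;
  bool p23 = p21 & p22;
  bool p26 = a >= e;
  bool p27 = p21 & p26;
  int ifc42 = p7 ? 1 : t_13;
  int ifc43 = p23 ? t_13 : ifc42;
  return p27 ? 0 : ifc43;
}

/* Mirror of the GIMPLE emitted after the patch: two selects.  The
   predicate of the last-sorted argument (0) is never tested.  */
int
t_two_selects (int a, int d, int e)
{
  int t_13 = 1 - a * d;
  bool p7 = a < 0;
  bool p21 = a >= 0;
  bool p22 = a < e;
  bool p23 = p21 & p22;
  int ifc42 = p23 ? t_13 : 0;
  return p7 ? 1 : ifc42;
}
```

All three functions agree on each of the three paths (a < 0, 0 <= a < e, and a >= e), which is the property the patch relies on.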

I've gathered some statistics and on x86_64-linux and i686-linux
bootstraps/regtests, this code triggers:
     92 4 4
    112 2 4
    141 3 4
   4046 3 3
(where the 2nd number is args_len, the 3rd is EDGE_COUNT (bb->preds),
and the first is the count of those from sort | uniq -c | sort -n).
In all these cases the patch should squeeze out one extra COND_EXPR and
its associated predicate (the latter only if it wasn't used elsewhere).

Incrementally, I think we should try to perform some analysis on which
predicates depend on inverses of other predicates and if possible try
to sort the arguments better and omit testing unnecessary predicates.
So essentially for the above testcase deconstruct it back to:
  _7 = a_10 < 0;
  _22 = a_10 < e_11(D);
  _ifc__42 = _22 ? t_13 : 0;
  t_6 = _7 ? 1 : _ifc__42;
which is like what this patch produces, but with the & a_10 >= 0 part
removed, because the last predicate is a_10 < 0 and so testing a_10 >= 0
on what appears on the false branch doesn't make sense.
But I'm afraid that will take more work than is doable in stage4 right now.
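
A C model of that further-reduced form shows why dropping the a_10 >= 0 test is safe: the inner select is only consulted when a < 0 is false. The sketch below uses hypothetical function names and assumes small inputs so that 1 - a * d does not overflow.

```c
#include <stdbool.h>

/* Branchy reference for the testcase's loop body.  */
int
t_reference (int a, int d, int e)
{
  if (a < 0)
    return 1;
  else if (a < e)
    return 1 - a * d;
  else
    return 0;
}

/* Mirror of the deconstructed form: the inner select tests only
   a < e.  When a < 0 the inner result may be "wrong", but the outer
   select discards it and returns 1.  */
int
t_reduced (int a, int d, int e)
{
  int t_13 = 1 - a * d;
  bool p7 = a < 0;
  bool p22 = a < e;
  int ifc42 = p22 ? t_13 : 0;
  return p7 ? 1 : ifc42;
}
```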

2023-04-15  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/109154
	* tree-if-conv.cc (predicate_scalar_phi): For complex PHIs, emit just
	args_len - 1 COND_EXPRs rather than args_len.  Formatting fix.
We were assuming that the result of evaluation of TARGET_EXPR_INITIAL would
always be the new value of the temporary, but that's not necessarily true
when the initializer is complex (i.e. target_expr_needs_replace).  In that
case evaluating the initializer initializes the temporary as a side-effect.

	PR c++/109357

gcc/cp/ChangeLog:

	* constexpr.cc (cxx_eval_constant_expression) [TARGET_EXPR]:
	Check for complex initializer.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/constexpr-dtor15.C: New test.
2023-04-15  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

	PR target/104989
	* config/pa/pa-protos.h (pa_function_arg_size): Update prototype.
	* config/pa/pa.cc (pa_function_arg): Return NULL_RTX if argument
	size is zero.
	(pa_arg_partial_bytes): Don't call pa_function_arg_size twice.
	(pa_function_arg_size): Change return type to int.  Return zero
	for arguments larger than 1 GB.  Update comments.
gcc/ada/
	PR bootstrap/109510
	* gcc-interface/decl.cc (gnat_to_gnu_entity) <types>: Do not reset
	align to zero in any case.  Set TYPE_USER_ALIGN on the type only if
	it is an aggregate type, or else a type whose default alignment is
	specifically capped on selected platforms.
PR target/54816 is now fixed on mainline.  This adds a test case to
check that it doesn't regress in future.  Tested with a cross compiler
to avr-elf.  Committed as obvious.

2023-04-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/testsuite/ChangeLog
	PR target/54816
	* gcc.target/avr/pr54816.c: New test case.
Recently the conditional move expander's predicates were loosened for the
benefit of the THEAD processors.  In particular one operand that was
previously "register_operand" is now "reg_or_0_operand".  That's fine for
THEAD, but breaks for SFB which requires a register for that operand.

This results in an ICE when compiling the testcase for an SFB target such as
the sifive s76.

This change adjusts the expansion code slightly to copy the value into
a register for SFB.

Bootstrapped and regression tested (c,c++,fortran only) with a toolchain
configured to enable SFB by default.

	PR target/109508
gcc/

	* config/riscv/riscv.cc (riscv_expand_conditional_move): For
	TARGET_SFB_ALU, force the true arm into a register.

gcc/testsuite
	* gcc.target/riscv/pr109508.c: New test.
There are several kinds of shortcut codegen for the RVV mask insns.  For
example:

vmxor vd, va, va => vmclr vd.

We would like to add more optimizations like this, but first we must
add tests for the existing shortcut optimizations, to ensure that
later changes do not break them.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/mask_insn_shortcut.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/scalar_move-2.c: Adjust include way
	for riscv_vector.h
	* gcc.target/riscv/rvv/base/spill-sp-adjust.c: Add missing
	-mabi.
Hi,

As PR108809 mentioned, vec_xl_len_r and vec_xst_len_r are tested
in gcc.target/powerpc/builtins-5-p9-runnable.c.
The vector operands of these two bifs differ between BE and LE when
viewed as v16_int8, even though they are the same when viewed as
128 bits (uint128/V1TI).

The test case gcc.target/powerpc/builtins-5-p9-runnable.c was
written for LE environment, this patch updates it for BE.

Tested on ppc64 BE and LE.
Is this ok for trunk?

BR,
Jeff (Jiufu)

gcc/testsuite/ChangeLog:

	PR testsuite/108809
	* gcc.target/powerpc/builtins-5-p9-runnable.c: Update for BE.
VRP queues edges to process late for updating global ranges for
__builtin_unreachable.  But this interferes with edge removal
from substitute_and_fold.  The following deals with this by
looking up the edge with source/dest block indices which do not
become stale.
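
The idea of queueing index pairs and re-resolving them later can be sketched in C as follows. Everything here is a hypothetical stand-in (struct block, struct cfg, struct queued_edge, edge_still_exists); it is not GCC's basic_block or edge machinery, only an illustration of why indices survive edge removal where a saved pointer would not.

```c
#include <stddef.h>

#define MAX_BLOCKS 8
#define MAX_SUCCS 2

/* Hypothetical stand-ins for a CFG.  */
struct block
{
  int index;
  int nsuccs;
  int succs[MAX_SUCCS];   /* destination block indices */
};

struct cfg
{
  struct block *blocks[MAX_BLOCKS];   /* NULL once a block is removed */
};

/* A queued entry remembers indices, not an edge pointer, so it stays
   meaningful even if the edge object is freed in the meantime.  */
struct queued_edge
{
  int src;
  int dest;
};

/* Re-resolve a queued entry.  Returns 1 if the edge still exists and
   0 if either block or the edge itself has been removed.  */
int
edge_still_exists (const struct cfg *g, struct queued_edge q)
{
  if (q.src < 0 || q.src >= MAX_BLOCKS || q.dest < 0 || q.dest >= MAX_BLOCKS)
    return 0;
  const struct block *src = g->blocks[q.src];
  if (src == NULL || g->blocks[q.dest] == NULL)
    return 0;
  for (int i = 0; i < src->nsuccs; i++)
    if (src->succs[i] == q.dest)
      return 1;
  return 0;
}
```

A consumer of the queue would call edge_still_exists on each entry and skip the ones that no longer resolve.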

	PR tree-optimization/109524
	* tree-vrp.cc (remove_unreachable::m_list): Change to a
	vector of pairs of block indices.
	(remove_unreachable::maybe_register_block): Adjust.
	(remove_unreachable::remove_and_update_globals): Likewise.
	Deal with removed blocks.

	* g++.dg/pr109524.C: New testcase.
With
make check-gcc check-g++ -j32 -k RUNTESTFLAGS='--target_board=unix\{-m32,-m32/-mavx,-m32/-mavx512f,-m32/-march=cascadelake,-m64,-m64/-mavx,-m64/-mavx512f,-m64/-march=cascadelake\}
+vect.exp=vect-simd-clone*'
the vect-simd-clone-1[678]f.c tests fail with -m32/-mavx512f and -m32/-march=cascadelake;
in that case there are zero matches rather than the 4 expected for ia32.
-m64/-mavx512f and -m64/-march=cascadelake work fine though (2 expected
matches).

So, the following patch just adds -mno-avx512f for x86 non-lp64.

2023-04-17  Jakub Jelinek  <jakub@redhat.com>

	* gcc.dg/vect/vect-simd-clone-16f.c: Add -mno-avx512f for non-lp64 x86.
	* gcc.dg/vect/vect-simd-clone-17f.c: Likewise.
	* gcc.dg/vect/vect-simd-clone-18f.c: Likewise.
AmpereOne (-mcpu=ampere1) breaks LDP instructions into two uops.
Given the chance that this causes instructions to slip into the next
decoding cycle and the additional overheads when handling
cacheline-crossing LDP instructions, we use the tuning structure to
disable the generation of LDP instructions by instruction combining
(such as in peephole2).

Given the code-density benefits in builtins and prologue/epilogue
expansion, we allow LDPs there.

This commit:
 * adds a new tuning option AARCH64_EXTRA_TUNE_NO_LDP_COMBINE
 * allows -moverride=tune=... to override this

These changes are benchmark-driven, yielding the following changes
(with a net-overall improvement):
   503.bwaves_r.      -0.88%
   507.cactuBSSN_r     0.35%
   508.namd_r          3.09%
   510.parest_r       -2.99%
   511.povray_r        5.54%
   519.lbm_r          15.83%
   521.wrf_r           0.56%
   526.blender_r       2.47%
   527.cam4_r          0.70%
   538.imagick_r       0.00%
   544.nab_r          -0.33%
   549.fotonik3d_r.   -0.42%
   554.roms_r          0.00%
   -------------------------
   = total             1.79%

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-Authored-By: Di Zhao <di.zhao@amperecomputing.com>

gcc/ChangeLog:

	* config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNING_OPTION):
	Add AARCH64_EXTRA_TUNE_NO_LDP_COMBINE.
	* config/aarch64/aarch64.cc (aarch64_operands_ok_for_ldpstp):
	Check for the above tuning option when processing loads.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/ampere1-no_ldp_combine.c: New test.
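
The effect of the new tuning option can be sketched as a bitmask check.
Only the option name NO_LDP_COMBINE comes from the commit; the enums, the
`pair_allowed` function and the way the origin of a pair is passed in are
assumptions made for illustration and do not mirror the real
aarch64_operands_ok_for_ldpstp signature.

```c
#include <stdbool.h>

/* Hypothetical flag set; the real flags are generated from
   aarch64-tuning-flags.def. */
enum extra_tune_flags {
    EXTRA_TUNE_NONE = 0,
    EXTRA_TUNE_NO_LDP_COMBINE = 1u << 0
};

/* Simplified view of where a load/store pair request comes from. */
enum pair_origin {
    PAIR_FROM_COMBINE,            /* e.g. peephole2 merging two loads */
    PAIR_FROM_PROLOGUE_OR_BUILTIN /* prologue/epilogue or builtin expansion */
};

/* Decide whether a pair instruction may be formed.  Only loads created by
   instruction combining are rejected when the tuning flag is set. */
bool pair_allowed(bool is_load, enum pair_origin origin, unsigned tune_flags)
{
    if (is_load
        && origin == PAIR_FROM_COMBINE
        && (tune_flags & EXTRA_TUNE_NO_LDP_COMBINE))
        return false;
    return true;
}
```

Keeping the decision in a tuning flag means a core that does not split LDP
can leave the bit clear and keep the previous behaviour.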
…69, PR 109318)

It turns out that since addition of the code that can identify globals
which are only read from, the code that keeps track of the references
can decrement their count for the same calls, once during IPA-CP and
then again during inlining.  Fixed by adding a special flag to the
pass-through variant and simply wiping out the reference to the
refdesc structure from the constant ones.

Moreover, during debugging of the issue I have discovered that the
code removing references could remove a reference associated with the
same statement but of a wrong type.  In all cases it wanted to remove
an IPA_REF_ADDR reference so removing a lesser one instead should do
no harm in practice, but we should try to be consistent and so this
patch extends symtab_node::find_reference so that it searches for a
reference of a given type only.

gcc/ChangeLog:

2023-04-14  Martin Jambor  <mjambor@suse.cz>

	PR ipa/107769
	PR ipa/109318
	* cgraph.h (symtab_node::find_reference): Add parameter use_type.
	* ipa-prop.h (ipa_pass_through_data): New flag refdesc_decremented.
	(ipa_zap_jf_refdesc): New function.
	(ipa_get_jf_pass_through_refdesc_decremented): Likewise.
	(ipa_set_jf_pass_through_refdesc_decremented): Likewise.
	* ipa-cp.cc (ipcp_discover_new_direct_edges): Provide a value for
	the new parameter of find_reference.
	(adjust_references_in_caller): Likewise. Make sure the constant jump
	function is not used to decrement a refdesc counter again.  Only
	decrement refdesc counters when the pass_through jump function allows
	it.  Added a detailed dump when decrementing refdesc counters.
	* ipa-prop.cc (ipa_print_node_jump_functions_for_edge): Dump new flag.
	(ipa_set_jf_simple_pass_through): Initialize the new flag.
	(ipa_set_jf_unary_pass_through): Likewise.
	(ipa_set_jf_arith_pass_through): Likewise.
	(remove_described_reference): Provide a value for the new parameter of
	find_reference.
	(update_jump_functions_after_inlining): Zap refdesc of new jfunc if
	the previous pass_through had a flag mandating that we do so.
	(propagate_controlled_uses): Likewise.  Only decrement refdesc
	counters when the pass_through jump function allows it.
	(ipa_edge_args_sum_t::duplicate): Provide a value for the new
	parameter of find_reference.
	(ipa_write_jump_function): Assert the new flag does not have to be
	streamed.
	* symtab.cc (symtab_node::find_reference): Add parameter use_type, use
	it in searching.

gcc/testsuite/ChangeLog:

2023-04-06  Martin Jambor  <mjambor@suse.cz>

	PR ipa/107769
	PR ipa/109318
	* gcc.dg/ipa/pr109318.c: New test.
	* gcc.dg/lto/pr107769_0.c: Likewise.
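
The two fixes in the commit above can be illustrated with a much simplified
model.  Nothing below is GCC code: `enum ref_kind`, `struct ref`,
`find_reference_of_kind`, `struct pass_through` and `decrement_once` are
hypothetical stand-ins, assuming that a reference is identified by a
statement id plus a kind and that a per-jump-function flag records whether
the counter was already decremented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified reference kinds; the real enumeration in GCC differs. */
enum ref_kind { REF_LOAD, REF_STORE, REF_ADDR };

struct ref { int stmt_id; enum ref_kind kind; };

/* Find the reference attached to stmt_id that has exactly the requested
   kind, so a reference of another kind on the same statement is never
   returned by mistake. */
struct ref *find_reference_of_kind(struct ref *refs, size_t n,
                                   int stmt_id, enum ref_kind kind)
{
    for (size_t i = 0; i < n; i++) {
        if (refs[i].stmt_id == stmt_id && refs[i].kind == kind)
            return &refs[i];
    }
    return NULL;
}

/* Toy pass-through jump function with the "already decremented" flag. */
struct pass_through { int *refcount; bool refdesc_decremented; };

/* Decrement the shared counter at most once per jump function.
   Returns true only when this call performed the decrement. */
bool decrement_once(struct pass_through *jf)
{
    if (jf->refdesc_decremented)
        return false;
    assert(*jf->refcount > 0);
    --*jf->refcount;
    jf->refdesc_decremented = true;
    return true;
}
```

In this model a second pass that sees the same call finds the flag set and
leaves the counter alone, which is the double decrement the commit
describes avoiding.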
@tschwinge tschwinge merged commit ddde0cf into master Apr 18, 2023